1. Introduction
Accurate and up-to-date information on the status of forest resources (extent, location, type) is crucial for sustainable forest management [
1], international commitments under the climate convention [
2], and accurate reporting of forest carbon content and carbon dioxide sequestration [
3,
4]. Additionally, information on the spatial distribution of forest and forest composition is essential for biodiversity assessment and monitoring and forest ecosystem conservation and protection [
5,
6,
7]. The process of forest management varies in terms of the spatial level of management, level of planning and management objectives [
8]. Different techniques and data can be used for forest management at, for example, the tree, stand, landscape, regional and national levels. Traditionally, forest inventory relies on statistical methods and field-based plot inventories, which are costly, time consuming, and spatially restricted. The field inventory data collected at the sampling plot level are usually aggregated to the forest stand level, with no information about the spatial variability of tree species within the stand. A traditional forest inventory over a larger geographical extent would be very costly and not feasible to undertake in a relatively short period of time. Advanced satellite-based remote sensing techniques can support a traditional forest inventory. They can help to produce wall-to-wall forest maps showing the forest type and tree species distribution at the regional and national levels.
The situation is more complex in countries like Poland, where around 80.8% of the forest is publicly owned, including that administered by the State Forests National Forest Holding (77%) [
9]. The State Forests are managed according to forest management plans, which are updated every 10 years using traditional sampling methods and with the goal of a sustainable forest economy. The forest inventory data are well recorded, aggregated to the stand level, and stored in the National Forest Database. In contrast, the privately owned forest is not as well inventoried or recognised. The privately owned forest accounts for around 20% of the total forest area in Poland and is managed by the local administration, specifically the district governor (at the second administrative level, NUTS 4). The governor acts as the controlling body for the private owners and is responsible for the preparation and verification of the simplified forest management plans and for the continuous monitoring of forest areas. The major problems identified around the management of the privately owned forest are i) missing or outdated simplified forest management plans and inventory data, ii) disagreement between the forest status on the ground and the official cadastral records, iii) a lack of up-to-date information on the forest extent, and iv) the high cost of the traditional forest inventory [
10]. Several studies discussed a disagreement concerning forest cover in Poland between land cadastre and status on the ground [
11,
12,
13]. Hoscilo et al. [
11] analysed the forest extent at the national level based on various spatial databases (i.e., Digital Forest Map, covering explicitly the State Forests, Topographic Database, Database of Parcel Identification System, Copernicus High Resolution Forest Layer, and National Forest Data Bank). The authors stressed a large discrepancy between the forest area in official cadastral land records and forest cover on the ground. According to this study, the forest area at the country level is larger by almost 800,000 ha compared to the official statistics. This discrepancy is partly caused by the unrecorded natural forest succession, which is particularly visible in the mountains [
14,
15,
16]. At the moment, the local authorities responsible for non-state forest management lack tools that would help them to accurately assess the current status of the forest. The local authorities identified detailed, accurate and up-to-date forest/non-forest mapping and forest type delineation as one of the most essential products to support non-state forest management in remote mountain areas.
Even though several pan-European and global satellite-based forest products are available, none of them meets the user requirements. The global forest change map has been available since the year 2000 and is based on a time series of Landsat data [
17]. The pan-European forest cover maps were produced, for example, by the Joint Research Centre for the year 2000 (Forest Cover Map at 25 m spatial resolution) and by the European Forest Institute (EFI) for the year 2006 (Forest Map of Europe at 1 × 1 km spatial resolution). More up-to-date is the High Resolution Forest Layer available as part of the Copernicus Land Monitoring Service. The Copernicus High Resolution Forest Layer contains information on the tree cover density and dominant leaf type (coniferous and broadleaf) at a spatial resolution of 20 × 20 m and is available for the years 2012 and 2015. The Copernicus High Resolution Forest Layer (HRL 2012), however, was found to have the highest overestimation error (7.5%) when compared with the national datasets used to derive the forest extent at the country level [
11]. In addition, the statistical verification of the Copernicus High Resolution Layer–Dominant Leaf Type 2015 showed a large overestimation of broadleaf forest at the expense of coniferous forest. The commission error for the broadleaf forest was 8.6% [
18]. An additional limitation of the HRL data is the time lag in data provision: currently, only the HRL 2012 and 2015 are available. Operational forest management requires more detailed, accurate, and more frequently updated data on the forest status.
The use of Earth Observation data, in particular freely available satellite data, together with advanced remote sensing techniques provides a cost-effective approach to obtaining systematic, wall-to-wall information on forest cover and forest type. There are many examples of forest-related studies and applications based on a time series of Landsat data [
19,
20,
21,
22,
23]. A study by Zhu and Liu [
20] confirmed the advantages of a dense seasonal Landsat time-series, compared to a single image in the classification of forest types. The authors achieved the highest accuracy by using the hierarchical approach, first to delineate the broadleaf forest and then to classify it into the oak forest and mixed-mesophytic forest. The best overall accuracy of 92.6% and a kappa coefficient of 0.85 were obtained using the combination of a time series of Landsat data and topography. The importance of phenological information contained in multi-seasonal images for forest type mapping was also stressed by other studies [
24,
25,
26,
27].
The launch of the European Sentinel-2 satellites opened a new era in the application of freely available, remotely sensed data in forestry. Recently, there has been an increase in research publications focused on the use of the Sentinel-2 data. The great advantage of the Sentinel-2 data is the higher spatial, temporal and spectral resolution compared to the Landsat series. Sentinel-2 offers 13 spectral bands, including four visible and near infrared bands at a 10 m spatial resolution, and four red-edge and narrow near infrared bands and two shortwave infrared bands at a 20 m spatial resolution [
28]. The launch of the twin Sentinel-2 A and B satellites, which share the same orbit, shortened the revisit time to five days, which increases the probability of acquiring a cloud-free image. Another big advantage of this mission is the wide swath of 290 km (more than 100 km wider than that of the Landsat satellites), which makes it an ideal sensor for forest analyses over large areas at the regional and national scales. A few studies have focused on the use of the Sentinel-2 data for the classification of tree type and tree species [
26,
29,
30,
31,
32,
33]. Wessel et al. [
30] analysed the separation of beech from oak trees in a managed forest in Bavaria. The authors applied the hierarchical classification approach, first to separate the forest types (coniferous and broadleaf) and then to classify beech and oak trees within the broadleaf forest mask. They tested the object- and pixel-based approaches and two machine learning algorithms, random forest (RF) and support vector machines (SVM), and concluded that there is no difference between the approaches. The SVM performed slightly better than RF but this could have been due to the training data used. Immitzer et al. [
32] classified six tree species (four coniferous and two broadleaf) in the Bavarian forest and achieved rather low results with an overall accuracy equal to 66%. A recent study by Persson et al. [
34] demonstrated that the highest overall accuracy (88.2%) in the discrimination of tree species can be obtained using all bands from the multi-temporal Sentinel-2 imagery. The authors applied the RF method to classify five common tree species (spruce, pine, larch, birch and oak) in a mature forest in Central Sweden. The studies described above focused on the local scale or on a single forest estate. A limited number of studies have addressed the forest status and composition over a larger area at the regional and national levels [
35]. The application of the satellite-based remote sensing methods for forest mapping over large mountain areas is very limited. A recent study by Liu et al. [
26] demonstrated the potential of freely-accessible multi-source remote-sensed data in forest type mapping over a flat and mountain area (up to 850 m a.s.l.) at the large regional scale in south-central China. The authors studied the combination of the multi-temporal Landsat-8, Sentinel-2 and SRTM digital elevation model (DEM) data for the classification of four tree species and for mixed forest types. They achieved the highest accuracy of 82.8% by combining sensors with terrain features. The overall accuracy increased by 15.2% by adding DEM, compared to a single image. Dorren et al. [
22] also stressed that topographic information, such as the DEM or features derived from a DEM, in combination with spectral data can improve the results of forest type classification in steep mountain terrain (ranging from 600 m to 3000 m a.s.l.) in Austria. They found that both topographic correction and classification with the DEM as an additional band increase the accuracy of the classification. The complexity of the mountain terrain and the variation in surface illumination between shaded and illuminated areas influence the accuracy of the classification. This issue was addressed by a recent study of Isuhuaylas et al. [
36]. The authors compared the performance of various machine-learning approaches, SVM, RF and k-Nearest Neighbor (kNN), for the classification of Andean mountain forest using a time series of Landsat-8 data and a DEM. The authors concluded that the SVM and RF methods gave a similar accuracy in the separation of the mountain forest from shrublands, whereas kNN was more sensitive to noisy training data.
The main objectives of this study are: i) to examine the potential of the multi-temporal Sentinel-2 data and its combination with topographic variables (DEM, slope, aspect) for mapping the forest/non-forest cover and forest type, and ii) to identify eight tree species (beech, oak, alder, birch, spruce, pine, fir and larch) over a large mountain area at the regional scale. We investigated the impact of the forest type stratification on the results of tree species classification following two approaches: i) all species were classified together within the forest mask, and ii) broadleaf and coniferous tree species were classified separately within the forest type masks. The study site is located in mountain terrain and part of it is covered by the Nowy Targ district, which is used as an example for the application of remote sensing techniques in operational management processes.
4. Discussion
The study demonstrated the potential of the multi-temporal Sentinel-2 data combined with the topographic information for the operational mapping of the forest cover, forest type and detailed delineation of common tree species on a large regional scale. It has to be stressed that the number of studies focusing on forest type and tree species identification at the larger geographical scale is rather limited [
35]. The wide swath of the Sentinel-2 mission, the higher spatial resolution of 10–20 m, and the revisit cycle of five days provide a good basis for mapping the forest composition over large areas.
Our results for the classification of forest cover, forest type and tree species are comparable to or more accurate than those of other studies. Liu et al. [
26] mapped the forest cover in China by applying an object-based random forest algorithm to single, multi-temporal and multi-sensor data. They achieved the highest accuracy of 99.3% by combining Sentinel-2 data with topographic information. The classification performed using only the Sentinel-2 data also gave a satisfactory overall accuracy of 97.2%. This is in line with the results for forest cover obtained in our study. The classification of the forest and non-forest cover provided a high overall accuracy of 98.2% for multi-temporal Sentinel-2 with and without the topographic information. The overall accuracy declined slightly to 94.8% for the delineation of coniferous and broadleaf forest types. Overall, the F1 value for coniferous forest was 1.8 percentage points higher than for the broadleaf forest (95.6% and 93.8%, respectively). In a study carried out in a managed forest in Germany (based on maximum likelihood classification (MLC) of multi-temporal SPOT-4/5 and RapidEye images), Stoffels et al. [
58] also achieved slightly better results for the coniferous forest (F1: 91%) compared to the broadleaf forest (F1: 90.4%). However, they achieved a lower overall accuracy of 90.7% compared to our results (94.8%).
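The accuracy measures compared above (overall accuracy, user's accuracy and F1) all follow from a confusion matrix. The following is a minimal sketch of that calculation, assuming rows hold the reference class and columns the mapped class; the function name `accuracy_metrics` and the counts are hypothetical and are not taken from the study.

```python
def accuracy_metrics(matrix, labels):
    """Return overall accuracy and per-class user's/producer's accuracy and F1.

    matrix[i][j] is the number of samples with reference class i
    that were mapped as class j (assumed layout for this sketch).
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(labels)))
    per_class = {}
    for k, label in enumerate(labels):
        mapped = sum(row[k] for row in matrix)   # column total
        reference = sum(matrix[k])               # row total
        users = matrix[k][k] / mapped if mapped else 0.0
        producers = matrix[k][k] / reference if reference else 0.0
        denom = users + producers
        f1 = 2 * users * producers / denom if denom else 0.0
        per_class[label] = {"users": users, "producers": producers, "f1": f1}
    return correct / total, per_class


# Hypothetical validation counts, for illustration only
labels = ["coniferous", "broadleaf"]
matrix = [[96, 4],
          [6, 94]]

overall, per_class = accuracy_metrics(matrix, labels)
print(f"overall accuracy: {overall:.3f}")
for label in labels:
    print(label, {name: round(value, 3) for name, value in per_class[label].items()})
```

Because F1 is the harmonic mean of user's and producer's accuracy, a difference between two F1 values expressed in percent is a difference in percentage points.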
Interestingly, the use of three topographic variables (DEM, slope and aspect) did not increase the accuracy of the classification of the forest/non-forest cover and forest type (broadleaf and coniferous). In both cases, the classification results obtained with or without topographic information were comparable (
Table 2 and
Table 3). Our results confirmed that spectral bands derived from the multi-temporal Sentinel-2 data are sufficient to accurately map forest cover and to separate the broadleaf forest from coniferous forest over a large area. This finding is consistent with the study of Zhu and Liu [
20] conducted in the second-growth forest in Vinton County, Ohio, USA. The authors concluded that the multi-temporal Landsat images are enough to distinguish broad land cover classes (including the broadleaf and coniferous forest). This research outcome should be taken into account for operational forest and forest type mapping at the regional and national scales.
In contrast, the importance of the topographic variables increased significantly in the process of tree species delineation. In our study, by combining the DEM, slope and aspect with the multi-temporal Sentinel-2 data, the classification of eight tree species improved from 75.6% (multi-temporal Sentinel-2) to 81.7% (classification without stratification, where all tree species were classified together) and reached the highest accuracy of 89.5% for the stratified approach. Liu et al. [
26] demonstrated the importance of slope derived from the SRTM DEM in the RF classification of four common tree species and four mixed forest types in part of China. The authors achieved the best accuracy of 82.8% by combining various spectral and textural features derived from Sentinel-2, multi-temporal Landsat images, Sentinel-1 VV data and topographic information. The overall accuracy was approximately 13 percentage points higher than for the classification using the combination of Sentinel-2 and topographic features (69.5%). The lower accuracy of the combination of Sentinel-2 and topographic features, compared to our study, may be due to the selection of the Sentinel-2 data. Liu et al. [
26] used the multi-spectral Sentinel-2 data collected in the leaf-on season only, whereas the Landsat images represented different vegetation seasons. In our study, we used four Sentinel-2 scenes representing change in phenology. According to Liu et al. [
26], the DEM, followed by aspect, was listed as the least important variable. This finding contradicts the result of our study, where the DEM showed the highest impact on the classification of tree species, followed by slope. This disagreement could perhaps be explained by the differences in vertical zoning in the mountain forest. For example, the presence of the spruce forest is highly dependent on elevation: it is characteristic of the upper timber zone and reaches the timberline, which may explain the high importance of elevation in the classification of the coniferous forest. The F1 accuracy of the classification of spruce forest increased by 10.3% when the topographic features were added; this was the highest increase amongst all the analyzed tree species. The DEM proved to be more important in the separation of the coniferous than the broadleaf species. The dependence of the broadleaf species, in particular beech, birch and oak, on elevation is visible in
Figure 5. Amongst the broadleaf tree species, birch and oak showed the highest accuracy improvement, of around 9.6%, when the spectral data were combined with the topographic features. The importance of the DEM as an additional variable in the classification of forest types by an MLC classifier in steep mountainous terrain (ranging from 600 m to over 3000 m a.s.l.) in Austria was highlighted by Dorren et al. [
22]. The authors improved the classification of four forest types from 64% to 73% by combining a DEM with Landsat images.
The results of our study demonstrated that multi-seasonal images in combination with topographic features are able to discriminate different tree species. The stratified, hierarchical approach to tree species classification examined in this study provided more accurate results than the non-stratified method. We first separated forest from non-forest, then divided forest into broadleaf and coniferous forest and, finally, tested whether the stratification by forest type improved the results of the tree species separation. The difference in accuracy between the two approaches was more pronounced for the broadleaf species than for the coniferous species. The classification accuracy increased from 81.7% for the non-stratified approach to 89.5% for the broadleaf and 82% for the coniferous species following the stratified method. Comparing the classification performance for individual tree species under the stratified approach, we obtained the highest user’s accuracy for oak (95.1%) and beech (92.3%), followed by birch (90.6%) and alder (83.1%). Wessel et al. [
30] applied the stratified approach to delineate the oak and beech species in the Bavarian forest (using Sentinel-2). They first mapped the broadleaf forest, then classified the tree species within the broadleaf forest and achieved a high user’s accuracy of 94% for beech and 100% for oak species. Zhu and Liu [
20] proposed a hierarchical classification approach to classify first the broad land cover and then the detailed forest type within the broadleaf forest. They applied this approach to the Landsat time series using the SVM method and achieved a user’s accuracy of 92% for oak trees.
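The stratified, hierarchical idea discussed above can be sketched in a few lines: a first model assigns the forest type, and a separate species model trained only on samples of that type assigns the species. This is an illustrative sketch only. A nearest-centroid rule stands in for the random forest classifier, and the two features (a reflectance value and elevation in km), the training values and all function names are hypothetical; only the assignment of the eight species to forest types follows the text.

```python
import math

# Forest type of each of the eight species named in the study
FOREST_TYPE = {
    "beech": "broadleaf", "oak": "broadleaf", "alder": "broadleaf", "birch": "broadleaf",
    "spruce": "coniferous", "pine": "coniferous", "fir": "coniferous", "larch": "coniferous",
}


def fit_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: mean vector}."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


def fit_stratified(samples):
    """Stage 1: forest type model. Stage 2: one species model per forest type."""
    type_model = fit_centroids([(x, FOREST_TYPE[y]) for x, y in samples])
    species_models = {}
    for forest_type in sorted(set(FOREST_TYPE.values())):
        subset = [(x, y) for x, y in samples if FOREST_TYPE[y] == forest_type]
        if subset:
            species_models[forest_type] = fit_centroids(subset)
    return type_model, species_models


def predict_stratified(models, features):
    type_model, species_models = models
    forest_type = predict(type_model, features)
    return forest_type, predict(species_models[forest_type], features)


# Hypothetical training samples: [reflectance, elevation in km]
train = [
    ([0.30, 0.6], "beech"), ([0.32, 0.7], "beech"),
    ([0.36, 0.4], "oak"), ([0.38, 0.3], "oak"),
    ([0.12, 1.1], "spruce"), ([0.14, 1.2], "spruce"),
    ([0.18, 0.9], "pine"), ([0.20, 1.0], "pine"),
]

models = fit_stratified(train)
print(predict_stratified(models, [0.31, 0.65]))
print(predict_stratified(models, [0.13, 1.15]))
```

One consequence of this design is that an error in the first stage cannot be recovered in the second, so the benefit of stratification depends on a reliable forest type mask.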
Our results were more accurate than those achieved using the non-stratified object-based approach based on Sentinel-2 by Immitzer et al. [
32] for beech (73.8%) and oak (46.7%). The classification of tree species based on high resolution SPOT and RapidEye data by Stoffels et al. [
58] also provided lower accuracy for oak (84%) and beech (79.5%). These values were more in line with the accuracy of the tree species classified together without stratification presented in our study. Lower accuracy for the classification of beech and oak (69% and 83%) using very high resolution WorldView images was also reported by Waser et al. [
59]. Slightly higher user’s accuracy for beech (86.9%), oak (85.4%), birch (88.6%) and alder (79.6%) was achieved by Immitzer et al. [
60] using the object-based random forest classification of the WorldView-2 satellite data in the East of Austria. Regarding the separation of coniferous tree species, the best user’s accuracy was obtained following the stratified approach for spruce (85%) and pine (84.1%), followed by larch and fir, with the user’s accuracy reaching almost 80%. The results of Stoffels et al. [
58] were slightly better for spruce (91.6%) and comparable for pine and fir. Slightly lower or comparable results for spruce (80.4%), pine (85.1%), larch (70.4%) and fir (82.3%) were obtained by Immitzer et al. [
60]. Waser et al. [
59] obtained a higher user’s accuracy for fir (85%) and larch (87%) compared to our results. Much lower accuracy was achieved based on Sentinel-2 data by Immitzer et al. [
32], where spruce reached a user’s accuracy of 77%, fir 71%, larch 64% and pine 60%.
Analyzing the importance of variables used in the classification process, we observed a difference between the coniferous and broadleaf tree species. The visible 10 m Sentinel-2 bands, in particular the red and green bands (B4 and B3), followed by two SWIR bands (B12 and B11), were the most important for the separation of the four coniferous tree species. In contrast, the visible bands were less important in the classification of broadleaf tree species, except for the red band (B4) derived from images acquired in autumn and spring. The high relevance of the visible bands is related to absorption by the photosynthetic pigments chlorophyll a and b [
35]. The red-edge bands (B5, B6 and B7) and two SWIR bands (B12 and B11) contributed the most to the results of the broadleaf tree species classification. The importance of red-edge bands in the classification of vegetation and separation of broadleaf species was highlighted by other studies [
30,
61]. Immitzer et al. [
32] also found two SWIR (B11 and B12), one red-edge (B5) and two visible (B2, B4) bands to be the most important in the classification of six tree species. The importance of the SWIR bands (B11 and B12) in the separation of four tree species (two coniferous and two broadleaf) and four mixed forest types was also pointed out by Liu et al. [
26].
The ranking of variable importance showed relatively low contributions of several spectral bands of the early autumn image (2 October); however, there were two very important spectral bands, the red-edge band (B5) and the red band (B4), which contributed significantly to the separation of broadleaf and coniferous species. This suggests that phenology is an important aspect in the classification of tree species, particularly in the broadleaf forest. Since plant phenology varies with species, it is important to select images representing the phenological cycle of the studied tree species [
21,
62]. Changes in vegetation are most observable in spring, as the greening-up of leaves and the intense green color of needles, and in autumn, as the coloring of leaves due to leaf senescence. However, the low sun elevation angle in very early spring and late autumn images may reduce the accuracy of the classification. Terrain correction partly compensates for the illumination conditions on slopes facing towards or away from the sun, but it cannot eliminate this effect completely [
63]. This is particularly important for studies carried out over complex, steep mountain areas. In our study, even though the mountains were not steep, some pixels in the image acquired on 12 October were assigned as “dark area pixels”, which was related to the effect of the low sun elevation angle. We observed that the terrain correction improved the illumination condition over these areas. The use of multi-temporal datasets in areas where seasonal effects differ among the species can support the separation of the tree species and minimize illumination effects on the classification results.
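The effect of a low sun elevation on slopes facing towards or away from the sun can be illustrated with the standard illumination geometry: the cosine of the local solar incidence angle computed from slope, aspect and sun position. This is a generic sketch and not the terrain correction used in the study, which is not specified in this section; the function name and the sun angles below are assumptions chosen for illustration.

```python
import math


def illumination(slope_deg, aspect_deg, sun_zenith_deg, sun_azimuth_deg):
    """Cosine of the local solar incidence angle on a tilted surface.

    Values near 1 mean the slope faces the sun; values near 0 mean
    grazing illumination (dark pixels); negative values mean self-shadow.
    """
    slope = math.radians(slope_deg)
    zenith = math.radians(sun_zenith_deg)
    relative_azimuth = math.radians(sun_azimuth_deg - aspect_deg)
    return (math.cos(zenith) * math.cos(slope)
            + math.sin(zenith) * math.sin(slope) * math.cos(relative_azimuth))


# Assumed autumn geometry: sun 30 degrees above the horizon, due south
SUN_ZENITH = 60.0
SUN_AZIMUTH = 180.0

flat = illumination(0.0, 0.0, SUN_ZENITH, SUN_AZIMUTH)
south_facing = illumination(20.0, 180.0, SUN_ZENITH, SUN_AZIMUTH)
north_facing = illumination(20.0, 0.0, SUN_ZENITH, SUN_AZIMUTH)

print(f"flat: {flat:.3f}, south-facing: {south_facing:.3f}, north-facing: {north_facing:.3f}")
```

Even a moderate 20 degree slope receives very different illumination on its sun-facing and shaded sides at this sun elevation, which is why pixels on shaded slopes can appear dark in late autumn images.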
It has to be highlighted that the accuracy of the classification strongly depends on the quality of the reference samples. There are a few issues to point out. First, the stand-based forest inventories provide information aggregated to the forest stand level, with no information about the spatial variability of dominant tree species within the stand. Second, due to the dynamic changes of the forest status, the inventory data are not always up-to-date. In the case of our study, due to an outbreak of bark beetles, the spruce forest became very fragmented, and some of the randomly selected reference samples had to be moved to the remaining patches of nearby spruce forest within the homogeneous spruce stand. Third, the information assigned to the stands can be incorrect: we observed that small fragments of coniferous forest were sometimes present inside homogeneous broadleaf stands. On the other hand, with sample-based inventories, the position of the sampling plots may not always be accurate. Therefore, visual verification of the reference samples is essential to assure the quality of the training and validation datasets. Finally, in mountain terrain it is important to have reference samples covering the full elevation range of each species. In our study, a small patch of mountain pine was misclassified as spruce due to the underrepresentation of this species in the reference samples.
In general, the Sentinel-2 mission is well suited to large-scale analysis. Due to the wide swath and dense time series of the Sentinel-2 data, it is feasible to derive up-to-date forest cover and forest type maps at a high spatial resolution on an annual basis. The use of national reference datasets, multi-temporal Sentinel-2 data and algorithms tuned to the specific area makes it possible to produce more accurate products at a higher resolution (10 m) than those currently provided by the pan-European Copernicus High Resolution Forest Layers (spatial resolution of 20 m). Additionally, the time lag in the HRL data provision is a limiting factor for regional management purposes because it does not allow dynamic changes to be tracked. However, the Copernicus High Resolution Layers are a relevant dataset for national purposes, for example, to derive more detailed information on land cover than that provided by the Corine Land Cover databases.
The results presented in this study demonstrate the ability to provide highly accurate, detailed and up-to-date forest maps over a large area at the regional scale. This information is crucial for local administration to improve forest management, especially in remote areas. The Nowy Targ district is a good example of the use of remotely sensed data in operational management processes. Knowledge of the actual forest extent can help to identify areas with a discrepancy between the official cadastre data and the forest on the ground. The forest cover map derived as part of this study provides information on the actual forest extent within the district. The forest covers an area of 65,500 ha, which accounts for 44.4% of the Nowy Targ district area. This is almost 11,000 ha more than the forest area reported by official statistics [
39]. Another advantage of using the Earth Observation data and techniques is that we can map and monitor forest resources, regardless of land ownership. Our study demonstrated that the satellite-based products could support traditional forest inventories by providing a large-scale wall-to-wall map of forest extent, forest type or species distribution. The final satellite-based forest cover and forest type maps are incorporated in the web-based SAT4EST application, which is designed to support the local authorities in managing and controlling the activities over the non-state forest. The application is in the pre-operational phase.
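The forest area figures reported above for the Nowy Targ district imply a few further quantities that may be useful when comparing the map with cadastral records. The short calculation below derives them from the three stated values; it treats "almost 11,000 ha" as exactly 11,000 ha, so the derived numbers are approximations, and the variable names are illustrative.

```python
# Values stated in the text
forest_mapped_ha = 65_500   # forest area from the satellite-based map
forest_share = 0.444        # mapped forest as a share of the district area
surplus_ha = 11_000         # "almost 11,000 ha", used here as an upper bound

# Derived quantities (approximate)
district_area_ha = forest_mapped_ha / forest_share
official_forest_ha = forest_mapped_ha - surplus_ha
official_share = official_forest_ha / district_area_ha
relative_gap = surplus_ha / official_forest_ha

print(f"implied district area: {district_area_ha:,.0f} ha")
print(f"implied official forest area: {official_forest_ha:,} ha")
print(f"implied official forest share: {official_share:.1%}")
print(f"mapped forest exceeds official area by about {relative_gap:.0%}")
```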