Automatic Windthrow Detection Using Very-High-Resolution Satellite Imagery and Deep Learning

Abstract: Wind disturbances are significant phenomena in forest spatial structure and succession dynamics. They cause changes in biodiversity, impact forest ecosystems at different spatial scales, and have a strong influence on economics and human beings. The reliable recognition and mapping of windthrow areas are of high importance from the perspective of forest management and nature conservation. Recent research in artificial intelligence and computer vision has demonstrated the remarkable potential of neural networks in addressing image classification problems. The most efficient algorithms are based on artificial neural networks of nested and complex architecture (e.g., convolutional neural networks (CNNs)), which are usually referred to by a common term: deep learning. Deep learning provides powerful algorithms for the precise segmentation of remote sensing data. We developed an algorithm based on a U-Net-like CNN, which was trained to recognize windthrow areas on Kunashir Island, Russia. We used satellite imagery of very high spatial resolution (0.5 m/pixel) as source data. We performed a grid search among 216 parameter combinations defining different U-Net-like architectures. The best parameter combination allowed us to achieve an overall accuracy for the recognition of windthrow sites of up to 94% for landscapes covered by coniferous and mixed coniferous forests. We found that the false-positive decisions of our algorithm correspond to either seashore logs, which may look similar to fallen tree trunks, or leafless forest stands. While the former can be rectified by applying a forest mask, the latter requires the use of additional information, which is not always provided by satellite imagery.
This study extends the experience of using U-Net-like CNNs for windthrow patch recognition (the areas covered with uprooted and wind-snapped trees), demonstrating the peculiarities of exploiting high-resolution satellite imagery.


Introduction
Storm winds are a major factor of natural forest damage. Wind disturbances are an important component of forest ecosystem dynamics at different spatial scales [1][2][3][4][5]. It is assumed that global changes can increase the frequency and strength of storms, making the impact of winds a more significant factor for forests [6][7][8][9][10][11]. The identification and positioning of windthrow sites, along with the determination of their extent using satellite imagery, are of high importance for forest management and nature conservation. These problems are also closely related to the carbon balance [12,13], estimation of fire risk [14], bark beetle outbreaks [15], and management of salvage logging [16,17]. Remote sensing methods are of particular importance in areas with complex terrain and poorly developed infrastructure, because the ability to conduct ground-based surveys in such territories is greatly limited.
The commonly used methods for assessing forest disturbances consist of analyzing multitemporal remote sensing data from pre- and post-disturbance images.
Within Kunashir Island, the majority of the forested landscape is characterized by dark and mixed coniferous forests dominated by Sakhalin fir (Abies sachalinensis (F. Schmidt) Mast.), Yezo spruce (Picea jezoensis (Siebold & Zucc.) Carrière), and Sakhalin spruce (Picea glehnii (F. Schmidt) Mast.). Dark coniferous forest is typified by a moderately to fully closed canopy layer of needle-leaved evergreen trees, on average not exceeding 20 m in height. The average diameter at breast height (DBH) of adult trees is 30 cm, with a maximum of more than 70 cm. Broad-leaved cold-deciduous trees may form a scattered subcanopy, particularly in canopy gaps. Stands of stone birch (Betula ermanii Cham.) and Mongolian oak (Quercus mongolica Fisch. ex Ledeb.) with an understory layer of dwarf bamboos (Sasa spp.) encompass the seral forest communities. Stands of this forest type typically have a closed canopy (90-100% cover) of short trees approaching 10-12 m in height with DBH less than 20 cm. Occasionally, there is a sparse layer of old-growth birches and oaks with DBH of more than 100 cm. Stone birch communities (including krummholz) and Siberian dwarf pine thickets (Pinus pumila (Pall.) Regel) also occupy the subalpine vertical belt in the mountains. In December 2014, a violent storm characterized by strong winds and heavy precipitation of wet snow caused windthrows in the dark coniferous forests over large areas of the island. We found no wind disturbance in the subalpine forests during our ground-based forest studies and only minor damage in the birch-oak forest with dwarf bamboo.

Satellite Data
Source data were collected from 10 sites in different parts of Kunashir Island. Seven of these fell into the category of forest landscape with different types of forests and included windthrow areas. Three plots did not include forested areas or windthrow patches (Table 1). We used very-high-resolution, cloud-free imagery for all of these sites, which came from the satellites Pleiades-1A/B (five images) and WorldView-3 (one image). The spatial resolution of the pansharpened images obtained from the Pleiades-1A/B satellite sensors was 0.5 m/pixel [51]. These images were provided without atmospheric corrections and encoded in RGB colorspace (Supplementary Materials, Figures S1-S10). Due to the relatively cloudy conditions of the oceanic climate in the study area, it was impossible to choose all images from a single satellite sensor, which led to images of slightly different quality and resolution. Images of the study sites have a resolution of 2500 × 2500 pixels (~1 km × 1 km on the ground in Pleiades-1A/B images), except for sites #4 and #9, where the images are smaller, 1250 × 1250 pixels (~500 m × 500 m). These images were later used to build the training and validation datasets by randomly and repeatedly cropping sub-images of 256 × 256 pixels (~128 m × 128 m).
In addition, we used a specific snapshot from WorldView-3 with a resolution of 0.31 m/pixel [52]. This snapshot was taken on the same date as most of the Pleiades-1A/B images and did not contain windthrow sites but included patches of leafless subalpine stone birch forests that look very similar to windthrows. It would be difficult for non-specialists to make the right delineation in this snapshot without contextual information about the landscape or a priori knowledge of the vegetation cover encountered at these types of sites. We used this image to demonstrate possible false-positive cases that may arise when using our approach to windthrow recognition.
We used images dated to 2015 since this corresponds to the first year after the forest wind damage occurred. For such a time span, specific patterns corresponding to windthrow patches were clearly recognized on satellite images. The natural overgrowing of disturbed forest sites by undergrowth and shrubs reduces the possibility of correctly recognizing these areas since the specific pattern of well-recognized fallen tree trunks disappears. Therefore, newly arisen windthrow areas cannot be reliably recognized after several vegetative seasons from the moment when the disturbance event occurred. For the same reason, we used images dated to the beginning of the growing season, with the exception of one part of the island in which cloudless images were available for July 2015.

Training and Validation Data
To train our model, we manually delineated masks of windthrow sites for all available images (Table 1). Original images and corresponding masks were read as arrays of different shapes. RGB images had the shape (w, h, 3), where w and h are the number of pixels per width and height, respectively, and 3 is the number of image channels. Pixel-wise labels (mask images) had the same size as the corresponding RGB images and differed only in the number of channels (equal to 1). A pixel of a mask image was internally assigned 1 if it belonged to the damaged area and 0 otherwise.
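This pairing of an RGB array with a single-channel binary mask can be sketched as follows (the function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def make_pair(rgb, damaged):
    """Pair an RGB image of shape (w, h, 3) with a binary mask (w, h, 1).

    `damaged` is any boolean array marking windthrow pixels; the mask
    stores 1 for damaged pixels and 0 for intact ones.
    """
    mask = damaged.astype(np.uint8)[..., np.newaxis]
    assert rgb.shape[:2] == mask.shape[:2]
    return rgb, mask

# a toy 4x4 image with a 2x2 damaged patch in one corner
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
damaged = np.zeros((4, 4), dtype=bool)
damaged[:2, :2] = True
img, mask = make_pair(rgb, damaged)
# img.shape == (4, 4, 3); mask.shape == (4, 4, 1); mask.sum() == 4
```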
Taking into account the experience of solving similar segmentation problems for remote sensing data [39], we chose an input image size of 256 × 256, which approximately corresponds to a 128 m × 128 m square on the ground for the Pleiades-1A/B images. This size is sufficient to capture the specific pattern of forest-disturbed areas, which is important for neural network training, and is not restrictive in terms of required memory. Training data were generated as batches of shape (m, 256, 256, 3), where m is the batch size (m = 20 in our experiments). The batches consisted of 256 × 256 sub-images randomly cropped from the original satellite images presented in Table 1. We thus had a stream (internally, a Python generator) of almost never repeated sub-images; these sub-images were combined into batches and used for the neural network training. Satellite images for sites #1, #3, #5, and #7-10 were used for training and #2, #4, and #6 for validation. The corresponding batches of mask data had shape (20, 256, 256, 1). The network training assessment was performed on sub-images generated from image #2 (Table 1). Images #4 and #6 were used for visualization and demonstration of the algorithm's efficacy.
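Such a stream of random crops might be implemented roughly as below (a sketch under the stated shapes; the paper's actual generator code is not shown in this excerpt):

```python
import numpy as np

def batch_generator(images, masks, batch_size=20, size=256, rng=None):
    """Yield endless batches of random (size x size) crops.

    `images` is a list of (H, W, 3) arrays with matching (H, W, 1)
    `masks`; yielded batches have shapes (batch_size, size, size, 3)
    and (batch_size, size, size, 1).
    """
    rng = rng or np.random.default_rng()
    while True:
        xb, yb = [], []
        for _ in range(batch_size):
            i = rng.integers(len(images))          # pick a source image
            img, msk = images[i], masks[i]
            r = rng.integers(img.shape[0] - size + 1)
            c = rng.integers(img.shape[1] - size + 1)
            xb.append(img[r:r + size, c:c + size])
            yb.append(msk[r:r + size, c:c + size])
        yield np.stack(xb), np.stack(yb)
```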
Augmentation is an important part of the neural network learning process that mitigates the problem of overfitting [53,54]. The original satellite images were obtained in different atmospheric conditions and had slightly different saturation values, so we decided to use a specific augmentation technique to expand the number of training images and thereby improve the network performance. As augmentation transformations, we chose random changes of the RGB channels of the original images and random vertical and horizontal flips. Random changes for each RGB channel did not exceed 0.1 in absolute value and were applied simultaneously to all channels, as implemented in the utility function "apply_channel_shift" from the Keras package [54]. Random flips provided additional variability in the images used for training and reduced overfitting. We also considered using small random rotations in the augmentation pipeline. However, adding rotations did not improve the network performance, and we excluded such transformations from the augmentation. It is worth noting that we did not use random shifts in the augmentation procedure. Such transformations would be redundant, since sub-images were cropped from a fixed set of satellite images and often intersected each other, which can already be regarded as a spatial shift.
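The augmentation step can be sketched in plain NumPy (a re-implementation in the spirit of Keras's `apply_channel_shift`; the single shift shared across channels follows the description above, and the exact Keras behavior may differ):

```python
import numpy as np

def augment(image, mask, rng, max_shift=0.1):
    """Random channel shift (at most 0.1 in absolute value, applied to
    all channels together) plus random horizontal and vertical flips.

    `image` is a float array scaled to [0, 1]; the mask receives the
    same flips but no colour change.
    """
    shift = rng.uniform(-max_shift, max_shift)
    image = np.clip(image + shift, 0.0, 1.0)
    if rng.random() < 0.5:                      # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                      # vertical flip
        image, mask = image[::-1], mask[::-1]
    return image, mask
```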
Therefore, with a batch size of 20 and typically up to 1500 epochs of training, the network saw almost 30,000 different augmented images of size 256 × 256.

Artificial Neural Network Architecture
The problem of forest damage identification is a semantic segmentation problem. Efforts to solve such problems have recently made significant progress due to artificial neural networks of complex architecture, which are closely related to a general term referred to as deep learning [55][56][57].
Semantic segmentation is a pixel-wise classification problem aimed at determining the class to which each particular pixel of the image belongs. It is usually handled by means of convolutional neural networks (CNNs). As noted in the introduction, one such CNN is U-Net, which has an encoder-decoder architecture [58].
U-Net can be viewed as a CNN consisting of two parts: the encoder and decoder blocks. The encoder block reduces the spatial dimensionality of the original image and learns to keep only the most important features. The decoder block performs the inverse operation: it increases the spatial dimensionality and learns to separate different parts of the original image (the segmentation task). At each level of depth, the output of the encoding block is concatenated with that of the decoding block, which improves the neural network's performance [58].
For this study, we used a U-Net-like CNN defined recursively as pseudocode (Python/Keras) in Algorithm A1.
The CONV_BLOCK function is a significant structural part of the U-Net-like neural network. It includes the convolutional transformations that are also part of classic U-Net: two consecutive 2D-convolutional layers. However, in our U-Net-like architecture, we optionally included two batch normalization layers [59], a residual connection (as was done for ResNet [57]), and a dropout layer [60]. By changing the corresponding parameters, we can tune the neural network architecture and choose the best combination. The traditional U-Net architecture corresponds to the default parameters of the GET_Unet function (defined in line #22 in Algorithm A1). The proposed structure of the convolutional block was inspired by various best-practice solutions publicly available on the Kaggle platform and by the performance study carried out for the ImageNet classification problem [61]. The latter computational study states that CNNs achieve good results on image classification problems when batch normalization is applied after a 2D-convolutional layer. Since there is no reason to put batch normalization right after the dropout layer (it does not transform inputs or introduce a bias), we placed it after each convolutional layer. There are still various possible extensions to the convolutional block. The provided architecture (Algorithm A1, CONV_BLOCK) is the closest to the original U-Net solution that incorporates both batch normalization and a residual connection. Another important parameter in our CNN is layer_rate. This parameter defines how the number of layers changes with the depth of the neural network. Its default value is 2, meaning that the number of layers is multiplied by two each time we dive one level deeper through the U-Net architecture (for classic U-Net, we have 64, 128, 256, etc., as the number of layers). Therefore, we can not only tune the depth of our CNN but also choose the number of layers for each level of depth.
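A minimal Keras sketch of such a configurable block is given below (based on the description of CONV_BLOCK above; the exact layer ordering and the 1×1 shortcut projection are assumptions, since Algorithm A1 itself is not reproduced in this excerpt):

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, n_filters, batch_norm=False, residual=False, dropout=0.0):
    """Two 3x3 convolutions with optional batch normalization after each
    convolution, an optional ResNet-style residual connection, and an
    optional dropout layer.
    """
    inp = x
    x = layers.Conv2D(n_filters, 3, padding="same", activation="relu")(x)
    if batch_norm:
        x = layers.BatchNormalization()(x)
    x = layers.Conv2D(n_filters, 3, padding="same", activation="relu")(x)
    if batch_norm:
        x = layers.BatchNormalization()(x)
    if residual:
        # project the input to n_filters channels so shapes match
        shortcut = layers.Conv2D(n_filters, 1, padding="same")(inp)
        x = layers.Add()([x, shortcut])
    if dropout > 0:
        x = layers.Dropout(dropout)(x)
    return x
```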
The neural network performance was assessed by means of the overall accuracy score and the mean value of intersection over union (MeanIoU), as implemented in Hamdi et al. [39]. The latter is widely used in semantic segmentation because it handles class-imbalanced cases, which are common in pixel-wise classification problems. The algorithm for computing the MeanIoU metric was as follows [39]: (1) MeanIoU values were computed using the MeanIoU function from the Keras package for each threshold value in the interval [0.5, 1) with a step of 0.05; (2) all these values were stored in an array, and the final average value of MeanIoU was computed.
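A plain-NumPy sketch of this metric is shown below (the paper used Keras's MeanIoU internally; averaging the two per-class IoU values at each threshold is an assumption about that function's behavior):

```python
import numpy as np

def mean_iou(y_true, y_prob):
    """Average two-class IoU over binarization thresholds 0.5, 0.55, ..., 0.95."""
    ious = []
    for t in np.arange(0.5, 1.0, 0.05):
        y_pred = y_prob >= t
        per_class = []
        for cls in (0, 1):
            inter = np.logical_and(y_true == cls, y_pred == cls).sum()
            union = np.logical_or(y_true == cls, y_pred == cls).sum()
            # a class absent from both arrays counts as a perfect match
            per_class.append(inter / union if union else 1.0)
        ious.append(np.mean(per_class))
    return float(np.mean(ious))
```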
The overall accuracy score was computed as a fraction of correctly classified pixels and their total number.

Neural Network Implementation and Tuning
The U-Net-like CNN described in Algorithm A1 was implemented in a Python-based (Python 3.7.3 was used) computational environment, which was built on top of the Keras framework [54] using Tensorflow [62] as a backend. All computations were performed on a PC with 1 GPGPU Nvidia Tesla K80 with 16 GB of RAM and required up to 10 h to train one CNN architecture.
We chose Adam [63] as the optimization algorithm to update the CNN weights and binary cross-entropy as the loss function. The latter is usually the method of choice when dealing with image segmentation and binary classification problems [64]. As stopping criteria for neural network training, we considered different approaches, from setting a fixed number of epochs to tracking specific behaviors of dynamically evaluated measures on the validation data. The latter approach is more advanced and helps prevent overfitting (when the CNN performs significantly better on the training data and worse on the test data) by stopping the training process once performance metrics keep improving on the training data but begin to degrade on the test data. At that point, the neural network starts to learn very specific patterns of the training dataset and loses the ability to generalize.
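The validation-tracking stopping rule can be sketched as a simple loop (a plain-Python analogue of Keras's EarlyStopping callback; the patience value is illustrative):

```python
def early_stopping(val_losses, patience=10, min_delta=0.0):
    """Return the epoch at which training should stop: the first epoch
    after which the validation loss failed to improve by `min_delta`
    for `patience` consecutive epochs.
    """
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best, best_epoch = loss, epoch      # new best validation loss
        elif epoch - best_epoch >= patience:
            return epoch                        # no improvement: stop here
    return len(val_losses) - 1                  # trained to the end
```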
To tune the architecture of the neural network, we performed a grid search over 216 combinations of the parameters described above. All of the best results corresponded to configurations in which the number of layers was equal to 64 and dropout was applied. The best one, in which additional batch normalization was applied, corresponds to the following parameters: num_layers = 64, depth = 4, layer_rate = 2, batch_norm = True, residual = False, dropout = 0.5.
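One hypothetical grid yielding exactly 216 combinations is sketched below (the specific candidate values are assumptions, as this excerpt does not list the actual grid; only the winning combination is stated above):

```python
from itertools import product

# A hypothetical grid of 3 * 3 * 2 * 2 * 2 * 3 = 216 combinations.
grid = {
    "num_layers": [16, 32, 64],
    "depth": [3, 4, 5],
    "layer_rate": [1.5, 2],
    "batch_norm": [False, True],
    "residual": [False, True],
    "dropout": [0.0, 0.25, 0.5],
}

# materialize every candidate architecture as a parameter dict
combos = [dict(zip(grid, values)) for values in product(*grid.values())]
```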
The high-intensity fluctuations of the loss function shown in Figure 2, which corresponds to the best set of parameter values, are caused by the specifics of the algorithm used at the training stage. We did not have a prebuilt set of images to use for training. Instead, we generated batches of training images randomly, on the fly, from the source images presented in Table 1 (we randomly cropped source images to 256 × 256 resolution and applied augmentation), and we never showed exactly the same subset of images to the network. Therefore, there was only one step per epoch during training. All of this led to ambiguities in the objective function being minimized and likely caused the high variance. We also tried changing the default learning rate of the Adam algorithm. Decreasing it did not lead to a significant reduction of the loss function fluctuations but increased the number of epochs required for training. At the end of the learning process (after 1500 epochs), we obtained average loss function values of less than 0.1.

Comparison to Traditional Machine Learning Algorithms
In contrast to CNNs, traditional machine learning methods do not account for the neighbors of the pixel being classified at a given moment. As one could expect, such methods should have lower performance than those that take into account the values of neighboring pixels (i.e., use information about the correlation between pixels). For comparison purposes, we considered the following algorithms that have been widely used to solve various supervised learning problems: (1) the naive Bayes classifier [65]; (2) logistic regression with L2 regularization [66]; (3) the support vector machine [67]; (4) AdaBoost [68]. We used the default implementations of these algorithms as given in the scikit-learn package [69]. The naive Bayes classifier rests on the assumption that the considered features are independent. It is relatively simple to train and usually yields coarse classification results. However, it is fast and can handle large amounts of data. Logistic regression belongs to the class of generalized linear models. Being linear in nature, it usually does not surpass more advanced methods based on the ensemble methodology (e.g., random forest, boosted trees). Due to its simplicity and the possibility of a probabilistic interpretation of its results, it is widely used in classification problems with continuous explanatory variables, when a quasi-linear relationship between those and the response probability can be presumed. The support vector machine builds a hyperplane that maximizes the gap between classified subsets of points. Due to the kernel trick, it can handle not only linearly separable cases. It usually leads to good and reliable results. The last one, the AdaBoost classifier, uses an ensemble of weak learners (e.g., decision trees) and combines them into a weighted sum that represents a boosted classifier, which is usually much stronger and leads to good classification results.
It is often called one of the best out-of-the-box classifiers [70].
These methods were trained on a randomly chosen subset of 100,000 points from the original satellite images marked as "train" in Table 1. Validation was performed by applying the trained algorithms to all pixels of satellite image #2. We then compared the results with the best solution obtained by the U-Net-like CNN proposed above (Table 2). In comparison to traditional machine learning algorithms (e.g., logistic regression, decision trees, and SVM), the CNN-based approach takes into account the relationships between neighboring pixels and can recognize patterns that are specific to sites of disturbed forests. Another advantage of neural networks over traditional machine learning methods is scalability: some of the latter are practical only on relatively small datasets. For example, it was quite hard to train the SVM algorithm on a dataset of >1 M points due to the increasing computational complexity of the underlying optimization algorithm [71]. At the same time, the CNN-based approach can handle large quantities of data without problems.
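The pixel-wise baseline setup might look like the following sketch (the toy data and the subset of models are illustrative; the paper used the default scikit-learn implementations on 100,000 sampled pixels):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

def train_pixelwise(rgb_pixels, labels):
    """Train simple pixel-wise baselines on (n, 3) rows of RGB values.

    No spatial context is used: each pixel is classified from its
    colour alone.
    """
    models = {
        "naive_bayes": GaussianNB(),
        "logistic_l2": LogisticRegression(max_iter=1000),  # L2 by default
    }
    for model in models.values():
        model.fit(rgb_pixels, labels)
    return models

# toy data: dark pixels = intact forest (0), bright pixels = windthrow (1)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.2, 0.05, (200, 3)),
               rng.normal(0.8, 0.05, (200, 3))])
y = np.repeat([0, 1], 200)
models = train_pixelwise(X, y)
```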
The entire computational workflow is presented in Figure A1 in Appendix B.

Results
The method of recognizing windthrow patches based on the pretrained U-Net-like CNN has an accuracy exceeding 94%. By contrast, the pixel-wise supervised learning methods, which are usually used to solve traditional classification problems (not image segmentation ones), do not achieve an accuracy higher than 85% (Table 2). It is worth noting that the accuracy values significantly depend on the type of terrain, which can resemble a patch of damaged forest, e.g., have a similar color. If such terrain is present in the image, it can decrease the accuracy of the result obtained by traditional supervised learning methods. At the same time, the CNN-based approach turns out to be more robust in such cases; it can "understand" the pattern specific to the damaged forest and is not weakened by false-positive decisions. The MeanIoU values (Table 2) show the same monotonic behavior as the accuracy metrics but are more sensitive and better suited to imbalanced cases, which leads to their drastic decrease for traditional machine learning methods, which usually produce more false-positive cases.
The last layer of our CNN has a sigmoid activation function. This allows us to interpret the output image (obtained on the forward pass through the network) as the pixel-wise probability that a pixel belongs to a damaged forest patch. However, probabilities are not always the desired result. For the purpose of calculating the total area of damaged forest, it makes sense to choose a threshold value in order to convert probabilities to binary values. If the probability exceeds this threshold, the pixel is classified as belonging to a windthrow site. The optimal threshold value can be estimated from the threshold-vs-accuracy curve by optimizing it on a grid (Figure 3). In our study, we set the threshold between 0.4 and 0.45, which is close to the middle of the typical range of probabilities (from 0 to 0.9) for an image. It should be noted that the spatial borders of windthrow patches are not smooth, homogeneous curves bounding a patch with 100% fallen trees inside. Some trees in such patches can even be untouched, or there can be several partially damaged ones. As a result, these areas are highlighted with lesser intensity and have smaller probability values than those belonging to the central parts of windthrow patches (Figures 4 and 5a).
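The thresholding step can be sketched as follows (the function name and toy probabilities are illustrative; the 0.5 m/pixel figure follows the Pleiades-1A/B resolution stated above, so each pixel covers 0.25 m²):

```python
import numpy as np

def damaged_area_m2(prob_map, threshold=0.45, pixel_size_m=0.5):
    """Convert a pixel-wise probability map into a binary windthrow mask
    and the total damaged area in square metres.
    """
    binary = prob_map >= threshold
    return binary, binary.sum() * pixel_size_m ** 2

probs = np.array([[0.9, 0.5],
                  [0.1, 0.3]])
mask, area = damaged_area_m2(probs)
# two pixels exceed 0.45, so area = 2 * 0.25 = 0.5 m^2
```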

Discussion
Our study shows that U-Net-like CNNs are able to identify small canopy gaps consisting of a few fallen trees and, vice versa, if one or more trees survived inside a windthrow, the area is highlighted with smaller probability values than a pure windthrow patch. CNNs learn a pattern that is specific to a windthrow area. Therefore, we can expect that the higher the resolution of the source images, the more detailed the patterns included in the training dataset and the more accurate the resulting semantic segmentation. In particular, the use of higher-resolution images, for instance LIDAR orthophotographs, provides data of greater detail than satellite imagery and leads to quite optimistic results [39]. As a prospective study, it would be interesting to perform a comparative analysis of images of different resolutions, for example, Pleiades-1A/B imagery (0.5 m resolution), WorldView-3 (0.31 m), and orthophotographs (<0.3 m). There are a number of problems in forest utilization and management where images of 0.5 m resolution lead to quite satisfactory and acceptable results. However, many problems still require images of finer resolution, among them instance segmentation problems, e.g., tree counting, tree species detection, and identification of small canopy gaps. On the other hand, the spatial extents at which unmanned aerial vehicle data are commonly acquired for vegetation mapping are generally limited to no more than a few hectares or square kilometers [47].
Hamdi et al. [39] pointed out that the shadows of neighboring trees falling on windthrow patches prevent CNNs from correctly classifying such patches. Apparently, this is one of the important limitations in identifying the fall of individual trees and delineating the boundaries of windthrow patches. While shadow-related problems can be effectively minimized for unmanned aerial vehicle imagery [72], for satellite imagery one should pay attention to the choice of images, taking into account the angle of sunlight incidence. We can also assume that dissected relief and the presence of heavily shaded slopes of northern exposure (in the northern hemisphere) and southern exposure (in the southern hemisphere) significantly affect the algorithm outputs, as well as pixel-based methods of interpreting remote sensing data.
There are several limitations to using CNNs to obtain accurate segmentation of wind-disturbed forest areas. The territory we chose has high landscape diversity and contains objects that could be mistakenly classified as windthrows when considered out of their landscape context. For example, there are coastal areas that are similar in color and include eroded slopes with banded traces of erosion (similar to tree trunks) (Figure 4). These areas could be mistakenly classified as windthrows. Another source of such false-positive errors is seashores with logs thrown up after storms. However, artifacts of this kind can easily be removed by applying masks of vegetation cover or forested areas, which exclude false-positive decisions at the post-processing stage.
Another interesting case is caused by the visual similarity of windthrow patches and areas covered by deciduous tree species in a leafless state. We used early summer images (1 June 2015), when the growing season on Kunashir Island was just beginning. Although deciduous tree species were already leafy at low altitudes, deciduous trees such as stone birch were still leafless in the subalpine zone. For such areas, our neural network produced false-positive decisions (subalpine birch forests in Figure 5b) and a near-zero precision in identifying windthrow patches. It should be noted that it is extremely hard or even impossible to correctly classify such leafless forest stands without external data. The authors were able to handle this case correctly only thanks to information obtained by one of them during fieldwork observations on Kunashir Island. We should note that at the beginning of the study, Figure 5b was mistakenly delineated as including windthrow patches. In this regard, preparing training and validation data is a very important step and should be carried out very carefully, taking into account the time elapsed since the disturbance event and discrepancies in phenology, which can depend on local landscape conditions.

Conclusions
Our study shows the possibilities for the efficient use of U-Net-like CNNs in identifying windthrow patches in southern boreal forests using satellite imagery of very high resolution. In contrast to pixel-wise methods of forest damage identification based on multispectral imagery, CNNs are demanding with respect to the resolution of the images used and are able to learn patterns specific to windthrow areas. Since such patterns are hard to discern in satellite imagery of medium spatial resolution, it is unlikely that applying deep learning methods, in particular U-Net-like CNNs, to images from satellite systems such as Sentinel-2 and Landsat will lead to results better than those obtained by existing and well-known methods of pixel-wise object identification.
The CNN's output is of a probabilistic nature and requires a threshold to be chosen to convert probabilities into a binary map. The choice of this threshold rests with the researcher. Our experience shows that a threshold of approximately 0.4-0.45 can yield quite good segmentation results. The average accuracy that can be reached by applying the U-Net-like CNN is 94%.
However, there are terrain areas with trunk-like patterns that are a source of false-positive decisions, as when forest without foliage is mistakenly classified as damaged. Such cases cannot be properly classified even by the human eye and require additional information about forest tree composition and imaging conditions.
There are important issues concerning the requirements imposed on training and validation images. Satellite imagery should be carefully selected taking into account (1) the time elapsed from the moment of the disturbance, before the vegetation regeneration process has begun, and (2) possible shifts in tree phenology, e.g., caused by vertical temperature gradients in mountain landscapes, which can result in the incorrect classification of leafless forest stands as wind-disturbed areas, since the corresponding images are very similar in color and structure.
Our study shows that deep learning algorithms are reliable and efficient methods for specific pattern segmentation based on very-high-resolution satellite imagery. Such methods are becoming more popular due to the availability of high-performance computing facilities and very-high-resolution satellite imagery. They are of high interest not only for damaged forest assessment but can also be applied to various problems arising in forestry management and nature conservation. In this context, there are single-object detection problems, in particular tree positioning and tree species identification, which can be solved using a deep learning approach. Another interesting class of problems for deep learning is the detection of drying trees subjected to external impacts, e.g., fungal infections or bark beetle outbreaks. These problems are of high interest in forestry and call for further investigation of the application of deep learning approaches in vegetation science.

Conflicts of Interest:
The authors declare no conflict of interest.