Forests
  • Article
  • Open Access

15 November 2025

Storm Damage and Planting Success Assessment in Pinus pinaster Aiton Stands Using Mask R-CNN

1 Coimbra Agriculture School, Polytechnic Institute of Coimbra, Bencanta, 3045-601 Coimbra, Portugal
2 RCM2+ Research Centre for Asset Management and Systems Engineering, Coimbra Institute of Engineering, Polytechnic Institute of Coimbra, Rua Pedro Nunes, 3030-199 Coimbra, Portugal
* Author to whom correspondence should be addressed.
Forests 2025, 16(11), 1730; https://doi.org/10.3390/f16111730
This article belongs to the Special Issue Remote Sensing Monitoring and Analysis of Forest Structure and Function in Relation to Climate Regulation

Abstract

In Portugal, increasing wildfire frequency and severe storm events have intensified the need for advanced monitoring tools to assess forest damage and recovery efficiently. This study explores the application of deep learning neural network techniques, specifically the Mask R-CNN architecture, for the automatic detection of trees in Pinus pinaster stands using RGB and multispectral imagery captured by a drone. The research addresses two distinct forest scenarios resulting from disturbances intensified by climate change. The first concerns the detection of fallen trees following an extreme weather event to support damage assessment and inform post-disturbance forest management. The second focuses on segmenting individual trees in a newly established plantation after wildfire to evaluate the effectiveness of ecological restoration efforts. The collected images were processed to generate high-resolution orthophotos and orthomosaics, which were used as input for tree detection with Mask R-CNN. The results show that integrating drone-based imagery with deep learning models can significantly enhance the efficiency of forest assessments, reducing fieldwork effort and increasing the reliability of the collected data. The models achieved high performance, with average precision scores of 90% for fallen trees and 75% for recently planted trees, while also enabling the extraction of spatial metrics relevant to forest monitoring. Overall, the proposed methodology shows strong potential for rapid response in post-disturbance environments and for monitoring the early development of forest plantations.

1. Introduction

Monitoring forest stands is essential for sustainable forest management, as it provides critical information about forest structure [,]. The analysis of forest structure, species composition, and regeneration dynamics supports operational planning and plays a vital role in evaluating ecosystem health. Information derived from these assessments is vital for enhancing forest resilience in the face of climate change [,,]. According to current climate change projections, Europe is anticipated to experience a temperature rise ranging from 2.4 °C under optimistic scenarios to up to 6 °C in the worst case. These climatic shifts are expected to intensify forest disturbances, including more severe and frequent droughts, pest outbreaks, wildfires, and extreme weather events.
Portugal, with its Mediterranean climate, is particularly vulnerable to these extremes and has witnessed increasingly intense wildfires in recent decades, notably in 2003, 2017, and more recently in 2025 []. Consequently, post-fire restoration has become a national priority, and assessing the success of reforestation and natural regeneration is critical for improving recovery strategies. In addition to the prevalence of wildfires, the increasing frequency of extreme weather events, including severe storms, has further compromised forest ecosystems. These phenomena lead to significant structural damage and increased tree mortality, thereby impacting forest resilience and overall biodiversity []. These disturbances also reduce timber value and impair key ecosystem services such as carbon sequestration, reinforcing the need for advanced tools to monitor forest condition and resilience under climate change [,,].
Traditionally, forest inventories have been performed using field-based surveys, which offer high levels of accuracy. However, these traditional methods face significant limitations regarding spatial coverage, time efficiency, and accessibility. Such challenges are particularly pronounced in the aftermath of severe weather events or when assessing recently planted stands located in rugged terrain. Recent advances in remote sensing, especially through the use of unmanned aerial vehicles (UAVs) equipped with high-resolution RGB and multispectral sensors, have offered a more flexible and cost-effective alternative for forest monitoring [,,,]. UAVs enable fast and repeated data acquisition over targeted areas, facilitating the timely detection of structural changes and vegetation conditions [,].
When integrated with artificial intelligence, particularly deep convolutional neural networks (CNNs) such as Mask R-CNN [17] and YOLOv8 (You Only Look Once) [18], UAV imagery can support automated detection of individual trees, fallen trunks, or plantation gaps, requiring minimal human intervention while achieving high spatial accuracy [,,,,]. Recent studies have shown the potential of deep learning in automating the assessment of forest damage following storms or tornadoes. Nasimi and Wood [20] employed a deep learning approach using YOLOv8x-seg to detect tornado-induced tree falls in forested regions of Kentucky and Tennessee. Their method achieved a mean average precision of over 80% in instance segmentation of fallen trees using 2 cm-resolution UAV imagery, demonstrating the effectiveness of 2D image analysis for post-disturbance mapping. Treefall detection is not the only application where this approach is valuable. Guo et al. [21] found that UAV imagery with ultra-high spatial resolution is effective for recognising young trees in complex plantation environments, even where overlapping vegetation and irregular spacing are present. Candiago et al. [22] also demonstrated the usefulness of vegetation indices such as NDVI and GNDVI for detecting gaps in plantations. However, it is the combination of these indices with deep learning algorithms that allows accurate and automated individual tree identification at scale [,].
Despite advancements in the field, several challenges remain in applying deep learning to forestry. A major limitation is the requirement for large, annotated datasets for model training, which are difficult to obtain given the variability of forest structures, species diversity, lighting conditions, and terrain types []. Diez et al. [26] emphasised that although deep learning models generally outperform traditional approaches in image-based forestry applications, their ability to generalise across forest types, canopy structures, and illumination conditions is still limited []. Similarly, Miao et al. [28] found that although multispectral UAV classifiers achieved over 93% accuracy, they still required active learning to generalise effectively across seasonal conditions.
Mask R-CNN is one of the most widely used deep learning architectures for instance segmentation in forestry applications [,,]. This model not only detects objects in an image but also generates a segmentation mask for each instance. It is particularly suited for individual tree detection due to its ability to handle overlapping crowns and produce spatially precise outputs. Kislov and Korznikov [31] applied Mask R-CNN to satellite RGB orthomosaic imagery and a manually annotated dataset for detecting windthrown trees in boreal forests. The model achieved over 85% precision and recall in detecting fallen trees, outperforming manual photo interpretation and significantly reducing processing time. Han et al. [24] utilised a similar approach to detect individual trees in plantations and orchards using RGB drone images processed with a CNN-based object detection framework. They reported accuracies above 90% under varying canopy densities and lighting conditions. In more recent research, Yao et al. [32] combined RGB images and Digital Surface Model (DSM) layers to differentiate between live and dead trees in mixed forest stands, highlighting the potential of combining spatial structure data with instance segmentation []. These works illustrate that while traditional spectral analysis has its merits, incorporating deep learning enables fine-scale, object-based monitoring that aligns more closely with adaptive forest management goals. For example, Worachairungreung et al. [34] employed deep networks (Faster R-CNN and Mask R-CNN) on UAV RGB images to detect and classify coconut trees in plantations with high accuracy, highlighting the potential of these approaches for individual tree-level monitoring.
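The per-instance output structure is what distinguishes Mask R-CNN from plain object detectors. The snippet below is a minimal sketch using the torchvision reference implementation (not the ArcGIS Pro tooling used later in this study); the random input tile stands in for a normalised orthomosaic tile.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Mask R-CNN with a ResNet-50 FPN backbone, COCO-pretrained weights.
model = maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Dummy 3-channel tile; real inputs would be orthomosaic tiles in [0, 1].
image = torch.rand(3, 512, 512)

with torch.no_grad():
    predictions = model([image])  # one dict per input image

pred = predictions[0]
# Each detected instance gets a box, a class label, a confidence score,
# and a soft per-pixel mask (thresholded, e.g. at 0.5, to a binary mask).
print(pred["boxes"].shape)   # (N, 4) xyxy boxes
print(pred["labels"].shape)  # (N,)
print(pred["scores"].shape)  # (N,)
print(pred["masks"].shape)   # (N, 1, 512, 512)
```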
Among the most efficient real-time object detection frameworks is YOLO, which has been successfully adapted for segmentation tasks. For instance, the study by Nasimi and Wood [20] applied the YOLOv8x-seg variant to post-tornado forest damage assessment, achieving high segmentation precision (mAP > 80%) and highlighting YOLO’s capability to map fallen trees with minimal computational cost. The method reduced the need for time-consuming field assessments and enabled near-real-time storm damage estimation. Similarly, recent advances in lightweight architectures, such as YOLO-UFS [35] and RSD-YOLOv8 [36], have extended their applicability to real-time fire and pest detection in forestry, enabling deployment on low-power edge devices.
While the application of deep learning in forestry has demonstrated considerable potential, several methodological challenges remain unresolved. One of the primary limitations concerns data requirements, as most deep learning models demand large volumes of annotated training data, typically ranging from hundreds to thousands of labelled instances. However, the forestry sector continues to lack open-access, standardised datasets suitable for robust model development and validation, as noted by Diez et al. [26]. A second major challenge is model generalisation: predictive performance often deteriorates when models are applied across forest types that differ in species composition, canopy density, or environmental conditions []. Soto-Vega et al. [37] similarly observed reduced generalisation in tropical plantations when CNNs trained on temperate forest data were transferred without adaptation.
A further complication relates to the influence of overlapping tree crowns and shadow effects, which remain problematic, particularly in dense stands, and negatively affect the performance of object detection frameworks such as Mask R-CNN and YOLO-based models []. Yao et al. [32] addressed this issue by incorporating DSM layers, which helped reduce the misclassification of overlapping crowns. Another underexplored yet promising avenue lies in the integration of multispectral data. While models trained solely on RGB imagery can achieve satisfactory results, especially when combined with segmentation algorithms and vegetation indices (VIs), the fusion of spectral, structural, and contextual features could further enhance model robustness and transferability. Allen et al. [39] demonstrated that RGB-based monitoring can yield high predictive accuracy when supported by VIs, confirming its potential for low-cost applications. Nevertheless, other studies [,] underscored the added value of incorporating multispectral or contextual information for improving classification performance under variable conditions [].
Addressing these challenges will require coordinated efforts to develop annotated datasets, adopt domain adaptation strategies, and integrate spectral and spatial data sources within deep learning frameworks for forest monitoring []. Building on previous research that has demonstrated the value of integrating UAV imagery and deep learning for forest monitoring, this work focuses on the specific application of these methods to Pinus pinaster stands. It addresses two key research questions: (i) Can Mask R-CNN accurately detect fallen trees in post-storm conditions using UAV imagery? (ii) Is it possible to identify individual young trees in recently planted Pinus pinaster stands using multispectral UAV data? By addressing these questions, the study aims to contribute to the development of automated, scalable tools to support forest monitoring, post-disturbance assessment, and restoration planning in Mediterranean pine-dominated landscapes.

2. Materials and Methods

2.1. Study Area

This study was conducted in two pilot sites located in central Portugal (Figure 1), both integrated in the Transform Agenda Project under the Better Forests Program, which aims to promote climate-resilient forest landscapes. The selected study areas included mixed stands dominated by Pinus pinaster, either mature or recently planted, allowing for the evaluation of two distinct forest monitoring challenges: detection of fallen trees following an extreme weather event, and identification of plantation gaps in early-stage forest regeneration.
Figure 1. Study area: Pilot site in Serra da Lousã/Pilot site in Serra do Açor. (a) Drone image of fallen trees in Serra da Lousã; (b) Drone image of recently planted trees site in Serra do Açor.
In Serra da Lousã, the focus was on detecting windthrown trees caused by storm Martinho (March 2025) in a mature stand composed primarily of Pinus pinaster, with the presence of Castanea sativa, Pseudotsuga menziesii and Quercus robur. This stand is part of a forest area established and managed by the Portuguese Forest Service (ICNF), characterized by a transition from monocultures to more structurally and ecologically diverse forests. The second site, located in Serra do Açor (municipality of Arganil), includes over 2500 hectares of community-managed land undergoing ecological restoration after the 2017 wildfires. The selected stand served as a case study for detecting trees in a newly established plantation and evaluating the spatial distribution and density of newly planted trees, particularly Pinus pinaster and Quercus robur.

2.2. Data Acquisition

A DJI Mavic 3 M UAV (DJI, Shenzhen, China) was employed for aerial data acquisition. This drone integrates RGB and multispectral cameras, with real-time kinematic (RTK) support and multi-constellation GNSS (GPS, Galileo, BeiDou, GLONASS), ensuring high precision georeferencing (Table 1). Before each flight, the UAV system was calibrated using its built-in GNSS and camera calibration routine, following the manufacturer’s recommended procedures to ensure geometric and radiometric accuracy.
Table 1. UAV camera specifications.
Flight altitude and speed were selected based on operational experience in forested environments. Flying at lower altitudes can hinder orthoimage generation, especially in dense, mature canopies, due to movement of foliage and branches caused by wind. To balance image quality and coverage area while considering battery limitations, flights were conducted at 120 m (the maximum legally allowed altitude for UAVs of this type in Portugal) with relatively slow speeds. High forward and side overlap (80%) further ensured sufficient matching points between images, minimizing errors in orthomosaic generation and supporting reliable object detection.
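To make the interaction of these flight parameters concrete, the sketch below computes the ground sampling distance (GSD) and flight line spacing from altitude and overlap using the standard photogrammetric relation. The camera parameters are assumed for illustration only; the actual sensor specifications are those given in Table 1.

```python
# Hypothetical camera parameters for illustration -- see Table 1 for the
# actual DJI Mavic 3M sensor specifications.
sensor_width_mm = 17.3    # assumed sensor width
image_width_px = 5280     # assumed image width
focal_length_mm = 12.3    # assumed focal length
altitude_m = 120.0        # flight altitude used in this study
overlap = 0.80            # forward/side overlap used in this study

# Ground sampling distance (cm/pixel): ground extent covered by one pixel.
gsd_cm = (sensor_width_mm * altitude_m * 100) / (focal_length_mm * image_width_px)

# Ground footprint of one image across track (m).
footprint_m = gsd_cm / 100 * image_width_px

# Spacing between adjacent flight lines given the side overlap (m).
line_spacing_m = footprint_m * (1 - overlap)

print(f"GSD: {gsd_cm:.2f} cm/px, footprint: {footprint_m:.0f} m, "
      f"line spacing: {line_spacing_m:.0f} m")
```

With these assumed values, the flight yields a GSD of roughly 3.2 cm/px, which is consistent with the centimetre-scale imagery needed to resolve individual stems and seedlings.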
Field data were collected to serve both georeferencing and model validation. A Spectra Precision SP60 GNSS Receiver (Spectra Geospatial, Westminster, CO, USA) and a Spectra Precision MobileMapper 50 (Spectra Geospatial, Westminster, CO, USA) device were used to acquire ground control points (GCPs) and reference tree positions. Tree biometric data, including diameter at breast height (DBH) and height, were obtained using a diameter tape and a TruPulse 200B Laser Rangefinder (Laser Technology, Inc., Centennial, CO, USA). These field measurements supported model evaluation through spatial accuracy and detection metrics.

2.3. Data Processing

All imagery was processed in ESRI Drone2Map (v2025.1) software, generating orthomosaics (Figure 2). RTK corrections were applied during processing to improve image location accuracy. The “Initial Image Scale” and “Matching Neighbourhood” parameters were adjusted to enhance tie-point generation. GCPs were manually marked to further refine orthoimage alignment. Final products were projected to ETRS89/PT-TM06 and exported as GeoTIFFs for use in ArcGIS Pro (v3.5).
Figure 2. Orthomosaic created with imagery from the Açor_1 study area.
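The final reprojection and export step can be sketched as follows, assuming the rasterio library and hypothetical file names; ETRS89/PT-TM06 corresponds to EPSG:3763. This is an illustrative equivalent of the Drone2Map export, not the software pipeline itself.

```python
import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling

dst_crs = "EPSG:3763"  # ETRS89 / Portugal TM06

with rasterio.open("orthomosaic_raw.tif") as src:
    # Compute the output grid for the target coordinate reference system.
    transform, width, height = calculate_default_transform(
        src.crs, dst_crs, src.width, src.height, *src.bounds)
    profile = src.profile.copy()
    profile.update(crs=dst_crs, transform=transform, width=width, height=height)

    # Warp each band into the new GeoTIFF.
    with rasterio.open("orthomosaic_pttm06.tif", "w", **profile) as dst:
        for band in range(1, src.count + 1):
            reproject(
                source=rasterio.band(src, band),
                destination=rasterio.band(dst, band),
                src_transform=src.transform,
                src_crs=src.crs,
                dst_transform=transform,
                dst_crs=dst_crs,
                resampling=Resampling.bilinear)
```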

2.4. Deep Learning Workflow

Training, validation, and testing datasets were defined to ensure both model learning and independent performance evaluation, as illustrated in Figure 3. The deep learning workflow involved two stages: manual annotation and model training. Using ArcGIS Pro’s “Label Objects for Deep Learning” tool, objects of interest (fallen trees and planted seedlings) were delineated as polygons on orthomosaics. The annotated samples were deliberately chosen to capture the natural variability of tree forms, colours, illumination conditions, and background complexity within the orthomosaics. These labelled datasets were exported with the “Export Training Data for Deep Learning” tool, which produced tiled imagery (typically 76 × 76 to 512 × 512 px) and metadata in R-CNN Masks format. These tiles formed the input for training instance segmentation models based on the Mask R-CNN architecture.
Figure 3. Data processing pipeline. Definition of training, validation and testing datasets.
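The export step can be pictured as a sliding-window tiling of the orthomosaic. The sketch below, assuming rasterio and a hypothetical input file, mirrors one tile/stride configuration reported in the Results (400 px tiles, 96 px stride); the actual export was performed with the ArcGIS Pro tool named above.

```python
import rasterio
from rasterio.windows import Window

TILE, STRIDE = 400, 96  # overlapping tiles, matching one reported setup

with rasterio.open("orthomosaic_pttm06.tif") as src:
    tiles = []
    for row in range(0, src.height - TILE + 1, STRIDE):
        for col in range(0, src.width - TILE + 1, STRIDE):
            window = Window(col, row, TILE, TILE)
            tile = src.read(window=window)  # (bands, TILE, TILE) array
            tiles.append(((row, col), tile))

print(f"{len(tiles)} tiles extracted")
```

Because the stride is much smaller than the tile size, each annotated tree appears in many tiles, which explains why the exported datasets contain far more instances than annotated trees.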
For fallen tree detection, 10 different training datasets were prepared, each including between 29 and 44 annotated trees. Over 52,300 image tiles were produced; because each tile could contain multiple annotated instances, the total number of tree instances was several times larger than the number of annotated trees. The datasets varied in object density (1.13 to 2.46 trees per tile) and mask size (2.30 to 24.31 units), capturing the variability in tree crown shapes, illumination, and background complexity. The most robust dataset included 24,328 tiles and 36,590 instances.
For plantation gap detection in Serra do Açor, training datasets included between 161 and 534 labelled trees per set, generating over 66,000 images and 126,000 annotated instances. The largest dataset contributed 20,006 images and 72,674 trees. Smaller sets with higher tree densities (up to 12.73 trees/tile) were also included to improve model performance in densely planted areas. Object sizes were generally small (1.31 to 2.31 units), matching the typical scale of young trees.
All models were trained using the “Train Deep Learning Model” tool in ArcGIS Pro, using a ResNet-50 backbone. The standard Mask R-CNN implementation available in ArcGIS Pro was used without any architectural modifications, as the focus of this study was on evaluating its performance across different forest conditions rather than on model development. During training, the default data augmentation settings available in ArcGIS Pro were applied, including standard transformations such as crop, dihedral_affine, brightness, contrast, and zoom. This procedure helped to improve the model’s generalization ability by exposing it to a wider range of illumination and spectral conditions, effectively mitigating issues related to shadows, variable lighting, and other image inconsistencies. Model performance was evaluated through training/validation loss curves and average precision (AP) scores. The trained models were then applied to the full orthomosaics using the “Detect Objects Using Deep Learning” tool, with adjusted parameters such as confidence threshold (0.25–0.75), test-time augmentation (TTA), and non-maximum suppression (NMS) to optimize performance.
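The configuration described above (Mask R-CNN with a ResNet-50 backbone and a sliced learning rate) can be sketched with the torchvision reference implementation. This is an illustrative stand-in for the ArcGIS Pro training tool, under the assumption of a single foreground class; the learning-rate values echo those reported in Section 3.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

NUM_CLASSES = 2  # one foreground class (tree) plus background

# Start from COCO-pretrained weights, as is typical for small forestry datasets.
model = maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box-classification head to match our class count.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

# Replace the mask head likewise.
in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, NUM_CLASSES)

# Discriminative learning rates, analogous to the "learning rate slice"
# reported in Section 3: smaller for the pretrained backbone, larger for
# the newly initialised heads (values echo the fallen-tree model).
optimizer = torch.optim.AdamW([
    {"params": model.backbone.parameters(), "lr": 7.6e-6},
    {"params": list(model.rpn.parameters())
             + list(model.roi_heads.parameters()), "lr": 7.6e-5},
])
```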

2.5. Model Validation and Performance Metrics

Ground-truth data were used to compute precision, recall, F1-score, and intersection-over-union (IoU) as shown in Equation (1), which served as the principal evaluation metric for segmentation accuracy.
$$\mathrm{IoU} = \frac{A_{\mathrm{intersection}}}{A_{\mathrm{union}}} \tag{1}$$
where $A_{\mathrm{intersection}}$ is the overlapping area between predicted and reference masks, and $A_{\mathrm{union}}$ is the total combined area.
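Equation (1) translates directly to binary masks; a minimal sketch (assuming NumPy) with a toy example:

```python
import numpy as np

def mask_iou(pred: np.ndarray, ref: np.ndarray) -> float:
    """Intersection-over-union between two boolean masks (Equation (1))."""
    intersection = np.logical_and(pred, ref).sum()
    union = np.logical_or(pred, ref).sum()
    return float(intersection / union) if union > 0 else 0.0

# Toy example: two offset 5x5 squares inside a 10x10 tile.
a = np.zeros((10, 10), bool); a[0:5, 0:5] = True
b = np.zeros((10, 10), bool); b[2:7, 2:7] = True
print(mask_iou(a, b))  # 9 / 41, approximately 0.22
```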
To assess the robustness and generalization ability of the models, a multi-level validation approach was implemented. Regarding the fallen trees model, validation plots (each 3600 m2) were randomly distributed across two orthomosaics: one corresponding to the training ortho and another representing a distinct site with different environmental and imaging characteristics (Figure 4).
Figure 4. Model validation plots and ground context visualization. (a) Plot located within the orthomosaic used for model training; (b) Plot located within an orthomosaic not used for model training.
These differences included variations in illumination, vegetation density, slope, and the presence of woody debris on the ground. Two validation plots were located within the training orthomosaic (total area of interest = 16.75 ha) and one within the independent orthomosaic (total area of interest = 4.14 ha). This configuration meant that approximately 4.3% of the area of interest in the training orthomosaic and 8.7% of the independent orthomosaic were used for validation, representing 5% of the total combined area. The 5% validation proportion was selected to ensure consistency with the validation approach adopted in the Serra do Açor study area and to provide a compromise between spatial representativeness and computational feasibility. The chosen proportion ensured sufficient coverage of different canopy and background conditions within and beyond the training area, while keeping manual annotation manageable and consistent across datasets. The larger validation proportion in the untrained orthomosaic was intentionally adopted to better evaluate the model’s transferability to new environments. Unlike the conventional inventory approach used for the planting areas, where field validation was performed on-site, validation of the fallen tree detection model was conducted visually using the orthomosaics. This decision was based on practical constraints such as GNSS positional errors that complicated tree-to-image correspondence, limited accessibility due to steep terrain, and hazardous conditions caused by post-storm debris accumulation. Visual interpretation of the orthomosaic was deemed appropriate, as the relatively large size and distinct spectral and geometric signatures of fallen trees make them readily identifiable in high-resolution UAV imagery.
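As a consistency check, the quoted proportions follow directly from the plot and area figures above:

```python
# Verifying the validation-area proportions quoted above.
plot_area_ha = 3600 / 10_000                 # each plot: 3600 m2 = 0.36 ha
train_aoi_ha, indep_aoi_ha = 16.75, 4.14     # areas of interest

train_share = 2 * plot_area_ha / train_aoi_ha                   # ~4.3%
indep_share = 1 * plot_area_ha / indep_aoi_ha                   # ~8.7%
total_share = 3 * plot_area_ha / (train_aoi_ha + indep_aoi_ha)  # ~5%
print(f"{train_share:.1%}, {indep_share:.1%}, {total_share:.1%}")
```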
The validation plots used in the Serra do Açor area varied in size from approximately 0.04 to 0.35 ha, as they were initially established for other ongoing studies within the same experimental area. However, because these plots collectively represented about 5% of the total area of interest and were randomly distributed across the orthomosaics, they were also employed for validating the detection of individual planted seedlings model. This approach ensured spatial representativeness and allowed the assessment of model performance under realistic field and imaging conditions (Figure 5).
Figure 5. Model validation plots and ground context visualization. (a–c) Plots located within the orthomosaic used for model training; (d,e) Plots located within other orthomosaics not used for model training. The figure illustrates the visual diversity of background conditions, including variations in illumination, vegetation density, and ground texture across the study area.
The highest proportion of validation area corresponded to orthomosaics not used in model training, ensuring a more robust and independent evaluation. Plots were randomly distributed within the selected areas. Unlike the validation of the fallen-tree detection model, which relied on a visual inventory from orthomosaics, this validation employed field-based ground truthing. All planted trees within the validation plots were geolocated using a GNSS antenna, and their DBH, height, and crown diameter were recorded for complementary studies.
In both case studies, performance evaluation was conducted using confusion matrices and derived metrics (Precision, Recall, and F1-score; Equations (2)–(4)), computed for three detection thresholds (0.75, 0.50, and 0.25) to assess the stability of the model’s predictions under different sensitivity settings. Additionally, Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) were calculated by comparing visual inventory counts with model detections across validation plots. The field information served as a reference to assess the detection rate of the model, allowing the quantification of true positives (TP) correctly identified, false positives (FP), and false negatives (FN) generated by the algorithm.
Precision corresponds to the proportion of detections considered correct (true positives, TP) relative to the total number of detections performed by the model (true positives + false positives, FP). It represents the reliability of the detections:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{2}$$
Recall, also referred to as sensitivity, measures the ability of the model to correctly identify the objects of interest, expressed as the proportion of true positives (TP) in relation to the total number of actual occurrences (true positives + false negatives, FN):
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3}$$
The F1-score provides a balanced measure between the reliability of the detections and the model’s coverage capacity. It is particularly useful when there is an imbalance between classes or between the number of correct and incorrect detections:
$$F1\text{-}\mathrm{score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{4}$$
These metrics were applied under three different confidence thresholds (0.75, 0.50, and 0.25), corresponding to the probability level used by the Mask R-CNN model to classify detections as true positives. Lower thresholds allowed more detections but increased the risk of false positives, while higher thresholds ensured more conservative and reliable predictions. This multi-threshold evaluation was performed to assess the stability of model performance under varying sensitivity levels, allowing a better understanding of how detection confidence influences precision and recall. By comparing results across thresholds, it was possible to identify the optimal balance between over- and under-detection, ensuring that the validation reflected realistic performance conditions for both case studies.
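The threshold sweep described above can be sketched as follows, assuming detections have already been matched to reference trees by IoU; the scores and counts below are hypothetical, for illustration only.

```python
import numpy as np

def metrics_at_threshold(scores, is_match, n_reference, threshold):
    """Precision, Recall, and F1-score (Equations (2)-(4)) for detections
    retained at a given confidence threshold. `is_match` flags detections
    that overlap a reference tree (e.g. mask IoU above a chosen cut-off)."""
    keep = scores >= threshold
    tp = int(np.sum(is_match[keep]))   # correct detections retained
    fp = int(np.sum(~is_match[keep]))  # spurious detections retained
    fn = n_reference - tp              # reference trees missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Hypothetical detections in one validation plot with 20 reference trees.
scores = np.array([0.95, 0.88, 0.81, 0.60, 0.55, 0.40, 0.30, 0.28])
is_match = np.array([True, True, True, True, False, True, False, False])
for t in (0.75, 0.50, 0.25):
    p, r, f = metrics_at_threshold(scores, is_match, 20, t)
    print(f"threshold {t:.2f}: P={p:.2f} R={r:.2f} F1={f:.2f}")

# Plot-level count errors (MAE and RMSE) compare model counts with the
# inventory counts across validation plots; counts here are hypothetical.
detected = np.array([34, 28, 41])
inventoried = np.array([36, 30, 38])
mae = np.abs(detected - inventoried).mean()
rmse = np.sqrt(((detected - inventoried) ** 2).mean())
print(f"MAE={mae:.1f}, RMSE={rmse:.1f}")
```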

3. Results

3.1. Fallen Trees and Young Trees Detection

3.1.1. Fallen Trees Detection

The best-performing configuration was identified as model “V16_RCNN_400/96/20rot” (Figure 6), which applied a learning rate slice of 7.5858 × 10⁻⁶ to 7.5858 × 10⁻⁵. This model achieved the highest precision score among all experimental runs, with a value of 0.91, indicating a strong ability to correctly segment tree crowns relative to the total number of predicted instances. Notably, the model performed robustly under visually challenging conditions, including heterogeneous canopy structures and complex background textures. These results confirm both the effectiveness of the selected architecture and the adequacy of the training strategy and underscore the importance of a well-annotated dataset in supporting accurate instance segmentation in post-disturbance forest environments.
Figure 6. Results of the model V16_RCNN_400/96/20rot.
In terms of operational performance, the best-performing model configuration, trained using the previously established labelled components, was evaluated not only for its detection accuracy but also for computational efficiency. The export of training data during the “label components” phase took 7 min with a 16 GB graphics card (NVIDIA RTX Ada Generation). The total training time amounted to 17 h and 49 min, reflecting the size of the dataset and the spatial resolution adopted (tile size of 400, stride of 96, and 20% rotation augmentation). Despite the relatively long training phase, the model demonstrated efficient inference capabilities: the instance segmentation stage required approximately 8 to 16 min per sample (orthomosaic maps between 4 and 25 hectares). These results indicate that the model can be feasibly applied in operational contexts (Figure 7) without prohibitive computational costs.
Figure 7. Fallen trees detection with deep learning.
Additional post-processing procedures, including non-maximum suppression (NMS) and test-time augmentation, were also implemented and evaluated. These techniques yielded acceptable trade-offs between computational cost and spatial accuracy, further refining the model’s performance. Overall, the results indicate that, beyond achieving a high precision score (0.9), the model demonstrates computational viability and is well-suited for near-operational deployment in post-disturbance forest monitoring applications.
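The NMS step can be illustrated with torchvision’s reference operator: overlapping detections are pruned, keeping only the highest-scoring box in any group whose pairwise IoU exceeds the chosen threshold. The boxes and scores below are hypothetical.

```python
import torch
from torchvision.ops import nms

boxes = torch.tensor([[0., 0., 100., 100.],
                      [10., 10., 110., 110.],   # heavy overlap with the first
                      [200., 200., 300., 300.]])
scores = torch.tensor([0.9, 0.75, 0.8])

# Suppress boxes with IoU > 0.5 against a higher-scoring box.
keep = nms(boxes, scores, iou_threshold=0.5)
print(keep)  # tensor([0, 2]) -- the second box is suppressed
```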

3.1.2. Recently Planted Trees Detection

The best-performing model, designated RCNN_V14_76_30_20rot (Figure 8), achieved an average precision score of 0.748. This configuration employed a learning rate slice of 9.1201 × 10⁻⁶ to 9.1201 × 10⁻⁵ and integrated rotation-based data augmentation techniques, which contributed to enhanced generalisation across heterogeneous spatial configurations and complex background textures.
Figure 8. Results of the model RCNN_V14_76_30_20rot.
Other competitive models included RCNN_V15_192_64_20rot (precision = 0.7163) and RCNN_v10_v4 (precision = 0.6813), further illustrating the sensitivity of model accuracy to both hyperparameter tuning and input preprocessing strategies such as image resolution and augmentation. By contrast, RCNN_v10 achieved the lowest precision score (0.3504), highlighting the critical role of appropriate architectural tuning and parameter configuration (Figure 9). Collectively, these findings reinforce the suitability of the Mask R-CNN architecture for instance segmentation tasks in post-disturbance plantation monitoring. Moreover, they emphasize the importance of a well-annotated dataset that can capture spatially heterogeneous canopy patterns, thereby enabling effective model training and evaluation in applied forestry contexts.
Figure 9. Mask R-CNN models precision comparison.
In terms of operational feasibility, the top-performing model for detecting recently planted trees, RCNN_V14_76_30_20rot, was also evaluated for computational performance. The export of training data during the “label components” phase took 4 min and 16 s. The final training process required approximately 9 h and 46 min (using a 12 GB graphics card), largely due to the high-resolution input configuration (tile size 76, stride 30, with 20% rotation) and the generation of over 49,000 image tiles. Despite the extended training duration, inference remained computationally manageable; the instance segmentation stage, incorporating both non-maximum suppression (NMS) and test-time augmentation (TTA), required approximately 1 h and 14 min when using an RGB orthomosaic map of approximately 80 hectares.
Other model configurations presented similar performance trade-offs. For instance, RCNN_V15_192_64_20rot completed training in approximately 2 h. Although more efficient computationally, this model achieved a slightly lower precision score (0.716) compared to RCNN_V14_76_30_20rot (0.748). These results underscore the operational feasibility of deep learning-based gap detection in forestry applications, while also highlighting the critical role of tile configuration and augmentation strategies in shaping both model accuracy and processing time. Striking an appropriate balance between computational efficiency and segmentation performance is essential for the large-scale and practical deployment of such models in forest monitoring workflows.
This analysis completes the evaluation of instance segmentation models for detecting trees in newly established plantations in post-disturbance restoration contexts. Compared to fallen-tree segmentation, this task demanded finer spatial granularity in both annotation and model resolution, due to the smaller size and more regular spatial arrangement of planted trees (Figure 10). Despite these challenges, the results demonstrate that Mask R-CNN architectures can adapt to such ecological conditions while maintaining computational scalability and operational relevance.
Figure 10. Tree detection on a recent plantation using Mask R-CNN.

3.2. Models’ Validation

3.2.1. Fallen Trees Detection Validation

The model demonstrated consistent and robust performance across different detection thresholds, with higher precision at stricter thresholds and increased recall at lower thresholds, reflecting the expected trade-off between false positives and false negatives. Detection rates were higher in the training orthomosaic, with high but slightly lower performance in the independent orthomosaic, especially under lower detection thresholds (Table 2). In this context, the detection thresholds (0.25, 0.50, and 0.75) correspond to the confidence levels used by the Mask R-CNN model to classify predictions as true positives, as previously detailed in the Materials and Methods Section. This pattern reflects the expected reduction in precision when applying a model to new imagery with different environmental and acquisition characteristics. Nonetheless, the results demonstrate strong model robustness and generalization capability for fallen tree detection in heterogeneous forest environments.
Table 2. Metrics results from validation plots (Serra da Lousã).
The overall performance metrics averaged across the three validation plots (two within the training area: Lousã_1.1, Lousã_1.2; and one external: Lousã_2.1) are presented in Table 3 below. The model exhibited a high generalisation capability, maintaining average Precision, Recall, and F1-score values consistently above 0.75 across all detection thresholds. The best balance between precision and recall was achieved at a detection threshold of 0.50, where the model reached an average Precision of 0.895, Recall of 0.845, and F1-score of 0.855. Higher thresholds (0.90 and 0.75) maintained perfect or near-perfect precision but reduced recall, indicating more conservative detections. Conversely, the lowest threshold (0.25) slightly improved recall at the expense of precision, as expected in instance segmentation models. These results indicate that the trained model exhibits robust performance across diverse image and terrain conditions, maintaining high detection accuracy even in regions not included in the training data.
Table 3. Metrics mean result—set of plots (Serra da Lousã).

3.2.2. Recently Planted Trees Detection Validation

Table 4 presents the detection metrics for the five validation plots, evaluated under three sensitivity thresholds (0.25, 0.50, and 0.75).
Table 4. Metrics results from validation plots (Serra do Açor).
The plots Açor_1.1, Açor_1.2 and Açor_1.3 correspond to the same orthomosaic area where the model was trained, although they differ slightly in slope and size. The Açor_2.1 plot, however, is located at a different site containing a mixture of species, including Pinus pinaster and Castanea sativa. In this case, both shrub composition and topographic conditions differ considerably, together with changes in illumination and shadow direction. Moreover, while the model was trained on a winter orthomosaic, this validation dataset was derived from a summer orthomosaic, which introduced significant differences in lighting conditions, canopy colouration, and overall scene characteristics. The last validation plot, Açor_3.1, also belongs to a distinct orthomosaic and was characterized by additional variability factors such as illumination direction, tree alignment (affecting shadow orientation), acquisition time, and the presence of felled or fallen trees, all of which alter the local context relative to the training data.
As summarized in Table 5, the mean precision, recall, and F1-score values across all plots demonstrate that the model performs robustly across varying detection thresholds, with the best overall balance observed at the 0.5 threshold (Precision = 0.88, Recall = 0.91, F1 = 0.89). The relatively low MAE (14.9) and RMSE (19.1) at this threshold further confirm that the model is both accurate and stable across diverse environmental and structural conditions.
Table 5. Metrics mean result—set of plots (Serra do Açor).

4. Discussion

This study explored the application of Mask R-CNN architectures for detecting fallen and newly planted trees using high-resolution RGB drone imagery in Pinus pinaster forest stands located in the Serra da Lousã and Serra do Açor, respectively. The results demonstrated that the proposed approach can achieve high performance, with average precision scores of 90% for fallen trees identified in the storm-damaged area and 75% for recently planted trees. These outcomes are consistent with those of Han et al. [24] and related studies [,,], which reported similar levels of accuracy in automatic tree counting using UAV data and deep learning models. Moreover, this method significantly reduces the need for intensive field campaigns, increases the temporal resolution of monitoring, and allows integration into forest management decision systems. The findings confirm that deep learning models can effectively identify forest anomalies with high spatial resolution and precision, even in structurally complex natural environments [].
In comparison with current state-of-the-art methods, the proposed approach demonstrates competitive performance. Precision, selected as the primary performance indicator due to its relevance in minimizing false positives in ecological applications, exceeded values reported in comparable studies. For instance, Kislov and Korznikov [31] employed high-resolution satellite imagery to detect windthrow, achieving only moderate accuracy in heterogeneous forest conditions. By contrast, the drone-based Mask R-CNN model presented here attained superior precision in fallen tree detection, likely attributable to the higher spatial granularity of UAV imagery and the architecture’s capacity for class-specific segmentation. Similarly, the results for detecting newly planted trees are consistent with those of Worachairungreung et al. [34], who used Mask R-CNN to classify coconut trees from UAV imagery. The model successfully delineated trees with high spatial accuracy, even when they were irregularly shaped or partially obscured by understory vegetation and shadowing. These results suggest that the method is well-suited for identifying structural anomalies in both monoculture and mixed-species forest plantations. Beyond simple tree counting, the purpose of tree detection in this study was to assess post-fire restoration success through both quantitative and qualitative indicators. In the newly planted areas, crown size and shape were used as proxies for tree vigour and early structural development, contributing to the understanding of growth dynamics and regeneration quality. This required a model capable of instance segmentation rather than bounding-box detection, as the latter does not provide sufficient spatial precision for evaluating crown morphology or overlap. Although one-stage object detectors such as YOLO are more computationally efficient, they lack the pixel-level delineation necessary for ecological metrics derived from crown geometry. Furthermore, Mask R-CNN was selected because it is expected to perform better in heterogeneous forest environments and under complex illumination or shadow conditions, where crown boundaries are irregular or partially obscured.
A further dimension explored in this study relates to the sensitivity of Mask R-CNN performance to training dataset size. By systematically varying training set sizes, we observed that model precision remained relatively stable beyond a minimal threshold, reinforcing the findings of Soto-Vega et al. [37] and Weinstein et al. [38], who emphasize the importance of dataset diversity over absolute volume. This study directly addresses the research questions formulated in the Introduction by linking the proposed approach to the main findings on model performance and robustness. It demonstrates that the Mask R-CNN architecture can reliably detect both fallen and newly planted trees using RGB drone imagery in Pinus pinaster stands, maintaining consistent performance even with different dataset sizes when supported by domain-specific data augmentation. Furthermore, it provides empirical evidence that UAV-based deep learning systems can be effectively implemented in operational forestry contexts, enabling large-scale, repeatable, and cost-efficient monitoring of structural anomalies in forest cover.
Nonetheless, certain limitations remain. The model’s precision tends to decline in areas characterised by severe shadowing, overlapping canopy structures, or dense understory vegetation, where target features may be partially or entirely obscured. Recent studies suggest that these limitations could be mitigated through the integration of complementary data sources, such as LiDAR, which can provide additional structural information to improve detection accuracy. Furthermore, although the model demonstrated good generalisation across two distinct forest sites, additional validation in more diverse ecological contexts and under varying seasonal conditions is necessary to fully assess its robustness and transferability.

5. Conclusions

This study addressed the challenge of detecting fallen and newly planted trees in Pinus pinaster forests, an essential component of timely forest management that has traditionally depended on manual, labour-intensive fieldwork. To overcome these constraints, we employed a deep learning approach based on the Mask R-CNN architecture applied to high-resolution RGB imagery acquired by UAVs. The methodology was tested in two ecologically and topographically distinct forest sites, achieving high detection precision in both scenarios.
The principal contribution of this research lies in demonstrating the feasibility of accurately identifying and mapping fallen and newly planted trees through UAV-based deep learning, generating outputs suitable for integration into operational forest monitoring workflows. Compared to conventional satellite-based methods or manual surveys, the proposed approach offers superior spatial resolution, reduced field effort, and greater temporal responsiveness, key advantages for adaptive forest management under increasingly dynamic disturbance regimes driven by climate change. In this context, more efficient methods for continuous forest assessment, which enable timely and spatially explicit monitoring, provide an essential foundation for adaptive forest management and long-term resilience planning in the face of ongoing climatic pressures.
Despite these strengths, certain limitations were observed. Model performance declined in areas with severe shadowing, overlapping canopy layers, or dense understory vegetation, where target features are visually obscured. In addition, while the model generalised well across the two study sites, its broader applicability to other forest types, structural conditions, and seasonal variability requires further validation beyond what has already been conducted. Future work should explore the integration of complementary data sources such as LiDAR and expand testing across diverse bioclimatic regions to assess model robustness and support wider adoption in forest monitoring and restoration planning.

Author Contributions

Conceptualization, R.S.-G. and B.F.; methodology, I.B., R.S.-G. and B.F.; software, I.B.; validation, I.B., R.S.-G. and B.F.; formal analysis, I.B.; investigation, I.B.; resources, R.S.-G. and B.F.; data curation, I.B.; writing—original draft preparation, I.B.; writing—review and editing, R.S.-G. and B.F.; visualization, I.B.; supervision, R.S.-G. and B.F.; project administration, R.S.-G. and B.F.; funding acquisition, R.S.-G. and B.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work is financed by the European Union–NextGenerationEU, project no. 34 TRANSFORM—Digital transformation of the forestry sector for a more resilient and low-carbon economy. Project cofinanced by Centro 2020, Portugal 2020, and the European Union, through the ESF (European Social Fund).

Data Availability Statement

The dataset samples are available upon reasonable request from the corresponding author.

Acknowledgments

We would like to acknowledge the Coimbra Agriculture School for providing institutional, technical, and logistical support throughout the development of this work. We also would like to express our gratitude to the three anonymous reviewers for their valuable suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DJI: Da-Jiang Innovations
GNSS: Global Navigation Satellite System
GPS: Global Positioning System
R-CNN: Region-based Convolutional Neural Network
RSD-YOLO: Remote Sensing Detection-You Only Look Once
YOLO-UFS: YOLO with Uncertainty Fusion Strategy

References

  1. Faias, S.P.; Beito, S.; Feliciano, D.; Páscoa, F.; Tomé, M.; Mendes, A. FORSEE: Uma Rede Europeia de Zonas Piloto Para a Avaliação de Critérios e Indicadores de Sustentabilidade Florestal. Available online: http://hdl.handle.net/10400.5/1763 (accessed on 14 October 2025). (In Portuguese).
  2. ICNF. 6.º Inventário Florestal Nacional—Relatório Final (IFN6). Instituto da Conservação da Natureza e das Florestas, 2015. Available online: https://www.icnf.pt/api/file/doc/0f0165f9df0d0bbe (accessed on 14 October 2025). (In Portuguese).
  3. Lechner, A.M.; Foody, G.M.; Boyd, D.S. Applications in remote sensing to forest ecology and management. One Earth 2020, 2, 405–412. [Google Scholar] [CrossRef]
  4. Wegler, M.; Kuenzer, C. Potential of earth observation to assess the impact of climate change and extreme weather events in temperate forests—A Review. Remote Sens. 2024, 16, 2224. [Google Scholar] [CrossRef]
  5. Simpson, N.P.; Sparkes, E.; de Ruiter, M.; Trogrlić, R.Š.; Passos, M.V.; Schlumberger, J.; Lawrence, J.; Mechler, R.; Hochrainer-Stigler, S. Advances in Complex Climate Change Risk Assessment for Adaptation. npj Clim. Action 2025, 4, 74. [Google Scholar] [CrossRef]
  6. Pereira, M.G.; Trigo, R.M.; da Camara, C.C.; Pereira, J.M.C.; Leite, S.M. Synoptic patterns associated with large summer forest fires in Portugal. Agric. For. Meteorol. 2005, 129, 11–25. [Google Scholar] [CrossRef]
  7. Diário de Coimbra. Civil Protection Recorded 5800 Incidents Across Mainland Portugal due to Storm Martinho, with Fallen Trees Among the Consequences, and Coimbra Among the Affected Areas. Diário de Coimbra. 20 March 2025. Available online: https://www.diariocoimbra.pt/2025/03/20/depressao-martinho-provocou-queda-de-arvores-e-incendios-rurais/ (accessed on 14 October 2025). (In Portuguese).
  8. Salas-González, R.; Fidalgo, B. Impacto de agentes de distúrbio nos serviços dos ecossistemas em povoamentos de pinheiro bravo na Serra da Lousã. In Geografia, Riscos e Proteção Civil: Homenagem ao Professor Doutor Luciano Lourenço; Nunes, A.N., Tavares, J.C., Ribeiro, F.M., Eds.; RISCOS—Associação Portuguesa de Riscos, Prevenção e Segurança: Coimbra, Portugal, 2021; Volume 2, pp. 213–224. ISBN 978-989-9053-05-2. [Google Scholar]
  9. Burmeister, J.-M.; Zabbarov, J.; Reder, S.; Richter, R.; Mund, J.-P.; Döllner, J. Fine-Tuning DeepForest for Forest Tree Detection in High-Resolution UAV Imagery. ISPRS Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2025, 48, 39–46. [Google Scholar] [CrossRef]
  10. Petrov, A.; Medvedev, D. Analyzing Post-Fire Vegetation Dynamics with Ultra-High Resolution Remote Sensing Data. ISPRS Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2025, 48, 1189–1196. [Google Scholar] [CrossRef]
  11. Pádua, L.; Adão, T.; Guimarães, N.; Sousa, A.; Peres, E.; Sousa, J.J. Post-fire forestry recovery monitoring using high-resolution multispectral imagery from unmanned aerial vehicles. ISPRS Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2019, 42, 301–305. [Google Scholar] [CrossRef]
  12. Food and Agriculture Organization of the United Nations (FAO). Global Forest Resources Assessment 2020 Main Report; FAO: Rome, Italy, 2020; Available online: https://www.fao.org/documents/card/en/c/ca9825en (accessed on 8 May 2025).
  13. Huang, X.; Zhang, Y.; Liu, D.; Wang, Y. Tree Species Classification from UAV Canopy Images with Deep Learning Models. Remote Sens. 2024, 16, 3836. [Google Scholar] [CrossRef]
  14. Turkulainen, E.; Hietala, J.; Jormakka, J.; Tuviala, J.; de Oliveira, R.A.; Koivumäki, N.; Karila, K.; Näsi, R.; Suomalainen, J.; Pelto-Arvo, M.; et al. Large-Area UAS-Based Forest Health Monitoring Utilizing a Hydrogen-Powered Airship and Multispectral Imaging. ISPRS Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2024, 48, 559–566. [Google Scholar] [CrossRef]
  15. Feigl, J.; Frey, J.; Seifert, T.; Koch, B. Close-range remote sensing of forest structure for biodiversity assessments: A systematic literature review. Curr. For. Rep. 2025, 11, 18. [Google Scholar] [CrossRef]
  16. Kulicki, M.; Cabo, C.; Trzciński, T.; Będkowski, J.; Stereńczak, K. Artificial Intelligence and Terrestrial Point Clouds for Forest Monitoring. Curr. For. Rep. 2025, 11, 5. [Google Scholar] [CrossRef]
  17. He, K.; Gkioxari, G.; Dollar, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anl. Mach. Intell. 2020, 42, 386–397. [Google Scholar] [CrossRef] [PubMed]
  18. Ultralytics. YOLOv8 Documentation. Available online: https://docs.ultralytics.com (accessed on 2 May 2025).
  19. Moreira, F.; Ascoli, D.; Safford, H.; Adams, M.A.; Moreno, J.M.; Pereira, J.M.C.; Catry, F.X.; Armesto, J.; Bond, W.; González, M.E.; et al. Wildfire management in Mediterranean-type regions: Paradigm change needed. Environ. Res. Lett. 2020, 15, 011001. [Google Scholar] [CrossRef]
  20. Nasimi, M.; Wood, R.L. Using deep learning and advanced image processing for the automated estimation of tornado-induced treefall. Remote Sens. 2024, 16, 1130. [Google Scholar] [CrossRef]
  21. Guo, X.; Liu, Q.; Sharma, R.P.; Chen, Q.; Ye, Q.; Tang, S.; Fu, L. Tree recognition on the plantation using UAV images with ultrahigh spatial resolution in a complex environment. Remote Sens. 2021, 13, 4122. [Google Scholar] [CrossRef]
  22. Candiago, S.; Remondino, F.; De Giglio, M.; Dubbini, M.; Gattelli, M. Evaluating multispectral images and vegetation indices for precision farming applications from UAV images. Remote Sens. 2015, 7, 4026–4047. [Google Scholar] [CrossRef]
  23. Shanableh, H.; Gibril, M.B.A.; Mansour, A.; Dixit, A.; Al-Ruzouq, R.; Hammouri, N.; Lamghari, F.; Ahmed, S.M.; Jena, R.; Mohamed, T.; et al. A Comparative Analysis of Deep Learning Methods for Ghaf Tree Detection and Segmentation from UAV-Based Images. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2025, X-G, 805–812. [Google Scholar] [CrossRef]
  24. Han, P.; Ma, C.; Chen, J.; Chen, L.; Bu, S.; Xu, S.; Zhao, Y.; Zhang, C.; Hagino, T. Fast tree detection and counting on UAVs for sequential aerial images with generating orthomosaic mosaicing. Remote Sens. 2022, 14, 4113. [Google Scholar] [CrossRef]
  25. Li, C.; Li, K.; Ji, Y.; Xu, Z.; Gu, J.; Jing, W. A Spatio-Temporal Multi-Scale Fusion Algorithm for Pine Wood Nematode Disease Tree Detection. J. For. Res. 2024, 35, 1875–1888. [Google Scholar] [CrossRef]
  26. Diez, Y.; Kentsch, S.; Fukuda, M.; Caceres, M.L.L.; Moritake, K.; Cabezas, M. Deep learning in forestry using UAV-acquired RGB data: A practical review. Remote Sens. 2021, 13, 2837. [Google Scholar] [CrossRef]
  27. Zhu, D.; Yang, P. Study on the Evolutionary Characteristics of Post-Fire Forest Recovery Using Unmanned Aerial Vehicle Imagery and Deep Learning: A Case Study of Jinyun Mountain, China. Sustainability 2024, 16, 9717. [Google Scholar] [CrossRef]
  28. Miao, S.; Wang, C.; Kong, G.; Yuan, X.; Shen, X.; Liu, C. Utilizing active learning and attention-CNN to classify vegetation based on UAV multispectral data. Sci. Rep. 2024, 14, 31061. [Google Scholar] [CrossRef] [PubMed]
  29. Guimarães, A.; Valério, M.; Fidalgo, B.; Salas-Gonzalez, R.; Pereira, C.; Mendes, M. Cork oak production estimation using a mask R-CNN. Energies 2022, 15, 9593. [Google Scholar] [CrossRef]
  30. Malta, A.; Lopes, J.; Salas-González, R.; Fidalgo, B.; Farinha, T.; Mendes, M. Pinus pinaster diameter, height, and volume estimation using mask-RCNN. Sustainability 2023, 15, 16814. [Google Scholar] [CrossRef]
  31. Kislov, D.E.; Korznikov, K.A. Automatic windthrow detection using very-high-resolution satellite imagery and deep learning. Remote Sens. 2020, 12, 1145. [Google Scholar] [CrossRef]
  32. Yao, S.; Hao, Z.; Post, C.J.; Mikhailova, E.A.; Lin, L. Individual tree crown detection and classification of live and dead trees using a mask region-based convolutional neural network (mask R-CNN). Forests 2024, 15, 1900. [Google Scholar] [CrossRef]
  33. Zhang, H.; Liu, B.; Yang, B.; Guo, J.; Hu, Z.; Zhang, M.; Yang, Z.; Zhang, J. Efficient tree species classification using machine and deep learning algorithms based on UAV-LiDAR data in North China. Front. For. Glob. Change 2025, 8, 1431603. [Google Scholar] [CrossRef]
  34. Worachairungreung, M.; Kulpanich, N.; Sae-ngow, P.; Anurak, K.; Hemwan, P. Classification of coconut trees within plantations from UAV images using deep learning with faster R-CNN and mask R-CNN. J. Hum. Earth Future 2024, 5, 560–573. [Google Scholar] [CrossRef]
  35. Luo, Z.; Xu, H.; Xing, Y.; Zhu, C.; Jiao, Z.; Cui, C. YOLO-UFS: A novel detection model for UAVs to detect early forest fires. Forests 2025, 16, 743. [Google Scholar] [CrossRef]
  36. Zhang, L.; Yu, S.; Yang, B.; Zhao, S.; Huang, Z.; Yang, Z.; Yu, H. YOLOv8 forestry pest recognition based on improved re-parametric convolution. Front. Plant Sci. 2025, 16, 1552853. [Google Scholar] [CrossRef]
  37. Soto-Vega, P.J.; Torres, D.L.; Andrade-Miranda, G.X.; Costa, G.A.O.P.; Feitosa, R.Q. Assessing the generalization capacity of convolutional neural networks and vision transformers for deforestation detection in tropical biomes. ISPRS Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2024, 48, 519–525. [Google Scholar] [CrossRef]
  38. Weinstein, B.G.; Marconi, S.; Bohlman, S.A.; Zare, A.; White, E.P. Cross-site learning in deep learning RGB tree crown detection. Ecol. Inform. 2020, 56, 101061. [Google Scholar] [CrossRef]
  39. Allen, M.J.; Moreno-Fernández, D.; Ruiz-Benito, P.; Grieve, S.W.D.; Lines, E.R. Low-cost tree crown dieback estimation using deep learning-based segmentation. Environ. Data Sci. 2024, 3, e18. [Google Scholar] [CrossRef]
  40. Balestra, M.; Marselis, S.; Sangey, T.T.; Cabo, C.; Liang, X.; Mokroš, M.; Peng, X.; Singh, A.; Stereńczak, K.; Vega, C.; et al. LiDAR Data Fusion to Improve Forest Attribute Estimates: A Review. Curr. For. Rep. 2024, 10, 87–104. [Google Scholar] [CrossRef]
  41. Brullo, T.; Barnett, J.; Waters, E.; Boulter, S. The Enablers of Adaptation: A Systematic Review. npj Clim. Action 2024, 3, 128. [Google Scholar] [CrossRef]
  42. Yu, K.; Hao, Z.; Post, C.J.; Mikhailova, E.A.; Lin, L.; Zhao, G.; Tian, S.; Liu, J. Comparison of Classical Methods and Mask R-CNN for Automatic Tree Detection and Mapping Using UAV Imagery. Remote Sens. 2022, 14, 295. [Google Scholar] [CrossRef]
  43. Braga, J.R.G.; Peripato, V.; Dalagnol, R.; Ferreira, M.P.; Tarabalka, Y.; Aragão, L.E.O.C.; de Campos Velho, H.F.; Shiguemori, E.H.; Wagner, F.H. Tree Crown Delineation Algorithm Based on a Convolutional Neural Network. Remote Sens. 2020, 12, 1288. [Google Scholar]
  44. Ecke, S.; Stehr, F.; Frey, J.; Tiede, D.; Dempewolf, J.; Klemmt, H.J.; Endres, E.; Seifert, T. Towards operational UAV-based forest health monitoring: Species identification and crown condition assessment by means of deep learning. Comput. Electron. Agric. 2024, 219, 108785. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
